Overview

Dataset statistics

Number of variables: 54
Number of observations: 1104972
Missing cells: 8264711
Missing cells (%): 13.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 455.2 MiB
Average record size in memory: 432.0 B
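
The two derived figures above can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and that MiB means 1024**2 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 54
n_observations = 1_104_972
missing_cells = 8_264_711
total_size_mib = 455.2

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024**2 bytes.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the 13.9% and 432.0 B shown in the report.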

Variable types

Numeric: 6
DateTime: 4
Text: 15
Categorical: 29

Alerts

OrdenLaboratorioDetalleId is highly overall correlated with NumeroControl [High correlation]
Region is highly overall correlated with Departamento and 1 other field [High correlation]
NumeroControl is highly overall correlated with OrdenLaboratorioDetalleId [High correlation]
Departamento is highly overall correlated with Region and 1 other field [High correlation]
DepartamentoResidencia is highly overall correlated with Region and 1 other field [High correlation]
Fiebre is highly overall correlated with Tos [High correlation]
Tos is highly overall correlated with Fiebre and 2 other fields [High correlation]
Cefalea is highly overall correlated with Rinorrea and 2 other fields [High correlation]
Rinorrea is highly overall correlated with Tos and 2 other fields [High correlation]
DolorGarganta is highly overall correlated with Tos and 3 other fields [High correlation]
DolorMuscular is highly overall correlated with Cefalea and 1 other field [High correlation]
PerdidaOlfato is highly overall correlated with PerdidaGusto [High correlation]
PerdidaGusto is highly overall correlated with PerdidaOlfato [High correlation]
TipoEdad is highly imbalanced (93.4%) [Imbalance]
Prueba is highly imbalanced (63.7%) [Imbalance]
Resultado is highly imbalanced (58.3%) [Imbalance]
PerdidaOlfato is highly imbalanced (69.9%) [Imbalance]
PerdidaGusto is highly imbalanced (75.9%) [Imbalance]
Diarrea is highly imbalanced (82.9%) [Imbalance]
OtroSintoma is highly imbalanced (86.8%) [Imbalance]
HipertensionArterial is highly imbalanced (64.4%) [Imbalance]
Diabetes is highly imbalanced (76.0%) [Imbalance]
EnfermedadPulmonarCronica is highly imbalanced (93.7%) [Imbalance]
Obesidad is highly imbalanced (85.3%) [Imbalance]
Asma is highly imbalanced (82.8%) [Imbalance]
EnfermedadRenalCronica is highly imbalanced (95.3%) [Imbalance]
Inmunosupresion is highly imbalanced (94.6%) [Imbalance]
AlcoholismoCronico is highly imbalanced (96.5%) [Imbalance]
EnfermedadNeurologicaCronica is highly imbalanced (96.6%) [Imbalance]
Tabaquismo is highly imbalanced (94.5%) [Imbalance]
Embarazo is highly imbalanced (82.0%) [Imbalance]
DepartamentoResidencia has 202752 (18.3%) missing values [Missing]
MunicipioResidencia has 202763 (18.4%) missing values [Missing]
Telefono has 202750 (18.3%) missing values [Missing]
CT has 951212 (86.1%) missing values [Missing]
FechaInicioSintomas has 210319 (19.0%) missing values [Missing]
Asintomatico has 259707 (23.5%) missing values [Missing]
Fiebre has 259707 (23.5%) missing values [Missing]
Tos has 259707 (23.5%) missing values [Missing]
Disnea has 259707 (23.5%) missing values [Missing]
Cefalea has 259707 (23.5%) missing values [Missing]
Rinorrea has 259707 (23.5%) missing values [Missing]
DolorGarganta has 259707 (23.5%) missing values [Missing]
DolorMuscular has 259707 (23.5%) missing values [Missing]
PerdidaOlfato has 259707 (23.5%) missing values [Missing]
PerdidaGusto has 259707 (23.5%) missing values [Missing]
Diarrea has 259707 (23.5%) missing values [Missing]
OtroSintoma has 259707 (23.5%) missing values [Missing]
EspecifiqueOtro has 259707 (23.5%) missing values [Missing]
HipertensionArterial has 259707 (23.5%) missing values [Missing]
Diabetes has 259707 (23.5%) missing values [Missing]
EnfermedadPulmonarCronica has 259707 (23.5%) missing values [Missing]
Obesidad has 259707 (23.5%) missing values [Missing]
Asma has 259707 (23.5%) missing values [Missing]
EnfermedadRenalCronica has 259707 (23.5%) missing values [Missing]
Inmunosupresion has 259707 (23.5%) missing values [Missing]
AlcoholismoCronico has 259707 (23.5%) missing values [Missing]
EnfermedadNeurologicaCronica has 259707 (23.5%) missing values [Missing]
Tabaquismo has 259707 (23.5%) missing values [Missing]
Embarazo has 259707 (23.5%) missing values [Missing]
SemanasGestacion has 259707 (23.5%) missing values [Missing]
OrdenLaboratorioDetalleId has unique values [Unique]
SemanasGestacion has 789118 (71.4%) zeros [Zeros]

Reproduction

Analysis started: 2023-07-09 15:20:12.087483
Analysis finished: 2023-07-09 15:27:40.370784
Duration: 7 minutes and 28.28 seconds
Software version: ydata-profiling v4.3.1
Download configuration: config.json
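
The reported duration follows directly from the two timestamps. A minimal sketch using only the Python standard library is shown below; the format string and variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-07-09 15:20:12.087483", FMT)
finished = datetime.strptime("2023-07-09 15:27:40.370784", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```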

Variables

OrdenLaboratorioDetalleId
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1104972
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 589056.72
Minimum: 12812
Maximum: 1164874
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.4 MiB

Quantile statistics

Minimum: 12812
5-th percentile: 68880.55
Q1: 302505.75
Median: 588366.5
Q3: 877376.25
95-th percentile: 1107741.4
Maximum: 1164874
Range: 1152062
Interquartile range (IQR): 574870.5

Descriptive statistics

Standard deviation: 332260.93
Coefficient of variation (CV): 0.56405593
Kurtosis: -1.1948794
Mean: 589056.72
Median Absolute Deviation (MAD): 287395
Skewness: -0.001033928
Sum: 6.5089118 × 10^11
Variance: 1.1039733 × 10^11
Monotonicity: Not monotonic
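
Several of the statistics for OrdenLaboratorioDetalleId are simple functions of the others, which makes them easy to sanity-check. The sketch below uses the standard textbook definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean times count). Because the inputs are the rounded values printed in the report, the results agree only up to rounding.

```python
# Rounded values copied from the report for OrdenLaboratorioDetalleId.
minimum, maximum = 12812, 1164874
q1, q3 = 302505.75, 877376.25
mean, std = 589056.72, 332260.93
n = 1_104_972  # number of observations, none missing

value_range = maximum - minimum   # reported: 1152062
iqr = q3 - q1                     # reported: 574870.5
cv = std / mean                   # reported: 0.56405593
variance = std ** 2               # reported: 1.1039733e11
total = mean * n                  # reported: 6.5089118e11

print(value_range, iqr)
print(f"{cv:.6f} {variance:.4e} {total:.4e}")
```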
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
40669 | 1 | < 0.1%
923895 | 1 | < 0.1%
541129 | 1 | < 0.1%
531305 | 1 | < 0.1%
310444 | 1 | < 0.1%
531306 | 1 | < 0.1%
1021534 | 1 | < 0.1%
1021900 | 1 | < 0.1%
318947 | 1 | < 0.1%
246035 | 1 | < 0.1%
Other values (1104962) | 1104962 | > 99.9%
Value | Count | Frequency (%)
12812 | 1 | < 0.1%
12813 | 1 | < 0.1%
12814 | 1 | < 0.1%
12815 | 1 | < 0.1%
12816 | 1 | < 0.1%
12817 | 1 | < 0.1%
12818 | 1 | < 0.1%
12819 | 1 | < 0.1%
12820 | 1 | < 0.1%
12821 | 1 | < 0.1%
Value | Count | Frequency (%)
1164874 | 1 | < 0.1%
1164873 | 1 | < 0.1%
1164872 | 1 | < 0.1%
1164871 | 1 | < 0.1%
1164870 | 1 | < 0.1%
1164869 | 1 | < 0.1%
1164868 | 1 | < 0.1%
1164867 | 1 | < 0.1%
1164866 | 1 | < 0.1%
1164865 | 1 | < 0.1%
Distinct: 925784
Distinct (%): 83.8%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB
Minimum: 2020-11-16 10:25:15.310000
Maximum: 2023-07-04 17:05:36.927000
Histogram with fixed size bins (bins=50)
Distinct: 1104648
Distinct (%): > 99.9%
Missing: 3
Missing (%): < 0.1%
Memory size: 8.4 MiB

Length

Max length10
Median length9
Mean length8.805562
Min length3

Characters and Unicode

Total characters9729873
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1104327 ?
Unique (%)99.9%

Sample

1st rowLNV 210778
2nd rowLNV 219064
3rd rowLNV 263159
4th rowLRA 14736
5th rowLNV 359736
ValueCountFrequency (%)
m 451257
20.4%
lnv 354497
 
16.0%
lrc 123026
 
5.6%
mlp 66579
 
3.0%
lra 56520
 
2.6%
lro 53090
 
2.4%
47686 5
 
< 0.1%
5846 5
 
< 0.1%
15893 5
 
< 0.1%
46875 5
 
< 0.1%
Other values (536144) 1104949
50.0%

Most occurring characters

ValueCountFrequency (%)
1104969
 
11.4%
2 769307
 
7.9%
3 760729
 
7.8%
4 729651
 
7.5%
1 720013
 
7.4%
L 653712
 
6.7%
5 587329
 
6.0%
6 537289
 
5.5%
7 529571
 
5.4%
9 526541
 
5.4%
Other values (10) 2810762
28.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6212511
63.8%
Uppercase Letter 2412393
 
24.8%
Space Separator 1104969
 
11.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 769307
12.4%
3 760729
12.2%
4 729651
11.7%
1 720013
11.6%
5 587329
9.5%
6 537289
8.6%
7 529571
8.5%
9 526541
8.5%
8 526287
8.5%
0 525794
8.5%
Uppercase Letter
ValueCountFrequency (%)
L 653712
27.1%
M 517836
21.5%
V 354497
14.7%
N 354497
14.7%
R 232636
 
9.6%
C 123026
 
5.1%
P 66579
 
2.8%
A 56520
 
2.3%
O 53090
 
2.2%
Space Separator
ValueCountFrequency (%)
1104969
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7317480
75.2%
Latin 2412393
 
24.8%

Most frequent character per script

Common
ValueCountFrequency (%)
1104969
15.1%
2 769307
10.5%
3 760729
10.4%
4 729651
10.0%
1 720013
9.8%
5 587329
8.0%
6 537289
7.3%
7 529571
7.2%
9 526541
7.2%
8 526287
7.2%
Latin
ValueCountFrequency (%)
L 653712
27.1%
M 517836
21.5%
V 354497
14.7%
N 354497
14.7%
R 232636
 
9.6%
C 123026
 
5.1%
P 66579
 
2.8%
A 56520
 
2.3%
O 53090
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9729873
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1104969
 
11.4%
2 769307
 
7.9%
3 760729
 
7.8%
4 729651
 
7.5%
1 720013
 
7.4%
L 653712
 
6.7%
5 587329
 
6.0%
6 537289
 
5.5%
7 529571
 
5.4%
9 526541
 
5.4%
Other values (10) 2810762
28.9%
Distinct: 839393
Distinct (%): 76.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB

Length

Max length13
Median length13
Mean length12.080935
Min length0

Characters and Unicode

Total characters13349095
Distinct characters43
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique665958 ?
Unique (%)60.3%

Sample

1st row0801197409314
2nd row0801197200856
3rd rowP-17519
4th row0209197901312
5th row1709198200981
ValueCountFrequency (%)
0000000000000 125
 
< 0.1%
0 83
 
< 0.1%
0801 82
 
< 0.1%
0408198800166 51
 
< 0.1%
0401199601223 39
 
< 0.1%
0418199400152 38
 
< 0.1%
1503197700353 37
 
< 0.1%
0401200001061 37
 
< 0.1%
0418199600113 32
 
< 0.1%
0704198300271 31
 
< 0.1%
Other values (839445) 1103723
99.9%

Most occurring characters

ValueCountFrequency (%)
0 3671705
27.5%
1 2486249
18.6%
9 1441570
 
10.8%
8 1013576
 
7.6%
2 961155
 
7.2%
7 719266
 
5.4%
6 716347
 
5.4%
5 712288
 
5.3%
3 658467
 
4.9%
4 607422
 
4.6%
Other values (33) 361050
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12988045
97.3%
Uppercase Letter 182920
 
1.4%
Dash Punctuation 178003
 
1.3%
Space Separator 110
 
< 0.1%
Other Punctuation 16
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 177697
97.1%
G 1206
 
0.7%
F 1037
 
0.6%
A 645
 
0.4%
E 625
 
0.3%
C 352
 
0.2%
B 172
 
0.1%
D 144
 
0.1%
R 105
 
0.1%
Y 102
 
0.1%
Other values (17) 835
 
0.5%
Decimal Number
ValueCountFrequency (%)
0 3671705
28.3%
1 2486249
19.1%
9 1441570
 
11.1%
8 1013576
 
7.8%
2 961155
 
7.4%
7 719266
 
5.5%
6 716347
 
5.5%
5 712288
 
5.5%
3 658467
 
5.1%
4 607422
 
4.7%
Other Punctuation
ValueCountFrequency (%)
. 12
75.0%
/ 3
 
18.8%
' 1
 
6.2%
Dash Punctuation
ValueCountFrequency (%)
- 178003
100.0%
Space Separator
ValueCountFrequency (%)
110
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13166175
98.6%
Latin 182920
 
1.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 177697
97.1%
G 1206
 
0.7%
F 1037
 
0.6%
A 645
 
0.4%
E 625
 
0.3%
C 352
 
0.2%
B 172
 
0.1%
D 144
 
0.1%
R 105
 
0.1%
Y 102
 
0.1%
Other values (17) 835
 
0.5%
Common
ValueCountFrequency (%)
0 3671705
27.9%
1 2486249
18.9%
9 1441570
 
10.9%
8 1013576
 
7.7%
2 961155
 
7.3%
7 719266
 
5.5%
6 716347
 
5.4%
5 712288
 
5.4%
3 658467
 
5.0%
4 607422
 
4.6%
Other values (6) 178130
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13349089
> 99.9%
None 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3671705
27.5%
1 2486249
18.6%
9 1441570
 
10.8%
8 1013576
 
7.6%
2 961155
 
7.2%
7 719266
 
5.4%
6 716347
 
5.4%
5 712288
 
5.3%
3 658467
 
4.9%
4 607422
 
4.6%
Other values (32) 361044
 
2.7%
None
ValueCountFrequency (%)
Ñ 6
100.0%
Distinct: 790614
Distinct (%): 71.6%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB

Length

Max length60
Median length48
Mean length26.21557
Min length5

Characters and Unicode

Total characters28967471
Distinct characters78
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique594369 ?
Unique (%)53.8%

Sample

1st rowCARLOS ROBERTO MEDINA ACOSTA
2nd rowRICARDO FRANCISCO GONZALEZ MEJIA
3rd rowWILLIAM JAMES LORENZ
4th rowCARLOS ALFREDO MIRANDA SABIO
5th rowDOUGLAS REINALDO ZERON JUAREZ
ValueCountFrequency (%)
maria 61623
 
1.5%
martinez 55694
 
1.4%
hernandez 54367
 
1.3%
lopez 52202
 
1.3%
jose 51500
 
1.3%
rodriguez 49683
 
1.2%
garcia 39593
 
1.0%
flores 35414
 
0.9%
mejia 34339
 
0.8%
cruz 31522
 
0.8%
Other values (81944) 3649023
88.7%

Most occurring characters

ValueCountFrequency (%)
A 4119026
14.2%
3013239
10.4%
E 2785544
9.6%
R 2338331
 
8.1%
I 2022198
 
7.0%
O 1884944
 
6.5%
N 1801839
 
6.2%
L 1754416
 
6.1%
S 1319482
 
4.6%
D 995735
 
3.4%
Other values (68) 6932717
23.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 25949820
89.6%
Space Separator 3013239
 
10.4%
Other Punctuation 2817
 
< 0.1%
Dash Punctuation 746
 
< 0.1%
Decimal Number 420
 
< 0.1%
Open Punctuation 158
 
< 0.1%
Close Punctuation 153
 
< 0.1%
Math Symbol 88
 
< 0.1%
Modifier Symbol 17
 
< 0.1%
Other Letter 7
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 4119026
15.9%
E 2785544
10.7%
R 2338331
 
9.0%
I 2022198
 
7.8%
O 1884944
 
7.3%
N 1801839
 
6.9%
L 1754416
 
6.8%
S 1319482
 
5.1%
D 995735
 
3.8%
M 889313
 
3.4%
Other values (29) 6038992
23.3%
Other Punctuation
ValueCountFrequency (%)
. 2596
92.2%
' 72
 
2.6%
/ 69
 
2.4%
, 38
 
1.3%
¡ 15
 
0.5%
# 11
 
0.4%
; 9
 
0.3%
¿ 3
 
0.1%
? 1
 
< 0.1%
" 1
 
< 0.1%
Other values (2) 2
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 122
29.0%
2 72
17.1%
1 66
15.7%
9 32
 
7.6%
3 32
 
7.6%
5 28
 
6.7%
8 20
 
4.8%
7 19
 
4.5%
6 17
 
4.0%
4 12
 
2.9%
Math Symbol
ValueCountFrequency (%)
| 85
96.6%
+ 1
 
1.1%
< 1
 
1.1%
= 1
 
1.1%
Open Punctuation
ValueCountFrequency (%)
[ 142
89.9%
( 15
 
9.5%
{ 1
 
0.6%
Close Punctuation
ValueCountFrequency (%)
] 136
88.9%
) 15
 
9.8%
} 2
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 745
99.9%
1
 
0.1%
Modifier Symbol
ValueCountFrequency (%)
´ 13
76.5%
` 4
 
23.5%
Space Separator
ValueCountFrequency (%)
3013239
100.0%
Other Letter
ValueCountFrequency (%)
º 7
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25949827
89.6%
Common 3017644
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 4119026
15.9%
E 2785544
10.7%
R 2338331
 
9.0%
I 2022198
 
7.8%
O 1884944
 
7.3%
N 1801839
 
6.9%
L 1754416
 
6.8%
S 1319482
 
5.1%
D 995735
 
3.8%
M 889313
 
3.4%
Other values (30) 6038999
23.3%
Common
ValueCountFrequency (%)
3013239
99.9%
. 2596
 
0.1%
- 745
 
< 0.1%
[ 142
 
< 0.1%
] 136
 
< 0.1%
0 122
 
< 0.1%
| 85
 
< 0.1%
' 72
 
< 0.1%
2 72
 
< 0.1%
/ 69
 
< 0.1%
Other values (28) 366
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28917674
99.8%
None 49796
 
0.2%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 4119026
14.2%
3013239
10.4%
E 2785544
9.6%
R 2338331
 
8.1%
I 2022198
 
7.0%
O 1884944
 
6.5%
N 1801839
 
6.2%
L 1754416
 
6.1%
S 1319482
 
4.6%
D 995735
 
3.4%
Other values (50) 6882920
23.8%
None
ValueCountFrequency (%)
Ñ 45177
90.7%
Í 1800
 
3.6%
Á 1009
 
2.0%
É 816
 
1.6%
Ó 727
 
1.5%
Ú 200
 
0.4%
¡ 15
 
< 0.1%
´ 13
 
< 0.1%
Ü 8
 
< 0.1%
Ð 8
 
< 0.1%
Other values (7) 23
 
< 0.1%
Punctuation
ValueCountFrequency (%)
1
100.0%

Edad
Real number (ℝ)

Distinct: 120
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.793797
Minimum: 1
Maximum: 518
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 7
Q1: 23
Median: 33
Q3: 48
95-th percentile: 70
Maximum: 518
Range: 517
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 18.727488
Coefficient of variation (CV): 0.52320486
Kurtosis: 3.4159761
Mean: 35.793797
Median Absolute Deviation (MAD): 12
Skewness: 0.62874371
Sum: 39551143
Variance: 350.71881
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
25 | 28483 | 2.6%
26 | 28101 | 2.5%
24 | 28022 | 2.5%
27 | 27894 | 2.5%
23 | 27567 | 2.5%
22 | 27237 | 2.5%
30 | 26463 | 2.4%
29 | 26383 | 2.4%
28 | 26233 | 2.4%
21 | 26118 | 2.4%
Other values (110) | 832471 | 75.3%
Value | Count | Frequency (%)
1 | 12013 | 1.1%
2 | 9103 | 0.8%
3 | 7641 | 0.7%
4 | 7462 | 0.7%
5 | 7477 | 0.7%
6 | 7914 | 0.7%
7 | 7657 | 0.7%
8 | 7456 | 0.7%
9 | 8040 | 0.7%
10 | 8387 | 0.8%
Value | Count | Frequency (%)
518 | 9 | < 0.1%
120 | 156 | < 0.1%
119 | 6 | < 0.1%
118 | 2 | < 0.1%
117 | 1 | < 0.1%
116 | 2 | < 0.1%
114 | 1 | < 0.1%
113 | 4 | < 0.1%
112 | 3 | < 0.1%
111 | 6 | < 0.1%
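
The largest Edad values (518, and the cluster at 120) sit far from the bulk of the distribution. One common rule of thumb for flagging such values is Tukey's fences, Q1 - 1.5 * IQR and Q3 + 1.5 * IQR. The report itself does not compute this; the sketch below is only an illustration using the quantiles and a few of the extreme counts listed above. Edad is also expressed in the unit given by TipoEdad, so a flagged value is a candidate for review, not necessarily an error.

```python
# Quartiles copied from the Edad quantile statistics.
q1, q3 = 23, 48
iqr = q3 - q1

# Tukey's fences (a rule of thumb, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A few of the largest values and their counts from the table above.
reported_extremes = {518: 9, 120: 156, 119: 6}

# Keep only the values that fall outside the fences.
flagged = {
    value: count
    for value, count in reported_extremes.items()
    if value < lower_fence or value > upper_fence
}

print(lower_fence, upper_fence)
print(flagged)
```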

TipoEdad
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB
AÑOS: 1087086
MESES: 14591
DIAS: 2988
HORAS: 307

Length

Max length5
Median length4
Mean length4.0134827
Min length4

Characters and Unicode

Total characters4434786
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAÑOS
2nd rowAÑOS
3rd rowAÑOS
4th rowAÑOS
5th rowAÑOS

Common Values

Value | Count | Frequency (%)
AÑOS | 1087086 | 98.4%
MESES | 14591 | 1.3%
DIAS | 2988 | 0.3%
HORAS | 307 | < 0.1%
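
The 93.4% imbalance figure reported for TipoEdad can be reproduced from these four counts. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value, log2 of the number of classes. That definition is an assumption here, adopted because it matches the reported number; the function name `imbalance_score` is illustrative.

```python
import math

# Class counts copied from the TipoEdad common values.
counts = {"AÑOS": 1_087_086, "MESES": 14_591, "DIAS": 2_988, "HORAS": 307}


def imbalance_score(class_counts):
    """Return 1 - normalized Shannon entropy (0 = balanced, 1 = one class only)."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(list(counts.values()))
print(f"{100 * score:.1f}%")
```

A perfectly balanced variable scores 0 under this definition, so values near 1 indicate that a single category dominates.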

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
años 1087086
98.4%
meses 14591
 
1.3%
dias 2988
 
0.3%
horas 307
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
S 1119563
25.2%
A 1090381
24.6%
O 1087393
24.5%
Ñ 1087086
24.5%
E 29182
 
0.7%
M 14591
 
0.3%
D 2988
 
0.1%
I 2988
 
0.1%
H 307
 
< 0.1%
R 307
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4434786
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1119563
25.2%
A 1090381
24.6%
O 1087393
24.5%
Ñ 1087086
24.5%
E 29182
 
0.7%
M 14591
 
0.3%
D 2988
 
0.1%
I 2988
 
0.1%
H 307
 
< 0.1%
R 307
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4434786
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1119563
25.2%
A 1090381
24.6%
O 1087393
24.5%
Ñ 1087086
24.5%
E 29182
 
0.7%
M 14591
 
0.3%
D 2988
 
0.1%
I 2988
 
0.1%
H 307
 
< 0.1%
R 307
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3347700
75.5%
None 1087086
 
24.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1119563
33.4%
A 1090381
32.6%
O 1087393
32.5%
E 29182
 
0.9%
M 14591
 
0.4%
D 2988
 
0.1%
I 2988
 
0.1%
H 307
 
< 0.1%
R 307
 
< 0.1%
None
ValueCountFrequency (%)
Ñ 1087086
100.0%

Sexo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 1024
Missing (%): 0.1%
Memory size: 8.4 MiB
MUJER: 636037
HOMBRE: 467911

Length

Max length6
Median length5
Mean length5.4238524
Min length5

Characters and Unicode

Total characters5987651
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowHOMBRE
2nd rowHOMBRE
3rd rowHOMBRE
4th rowHOMBRE
5th rowHOMBRE

Common Values

Value | Count | Frequency (%)
MUJER | 636037 | 57.6%
HOMBRE | 467911 | 42.3%
(Missing) | 1024 | 0.1%
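
The percentage shown for HOMBRE is 42.3% here but 42.4% in the word-frequency table further down. The difference appears to come from the denominator: this table counts the 1024 missing rows, while the later one appears to exclude them. The sketch below checks both readings; the variable names are illustrative.

```python
# Counts copied from the Sexo common values.
counts = {"MUJER": 636_037, "HOMBRE": 467_911}
missing = 1_024
n_rows = sum(counts.values()) + missing  # all observations

# Share of all rows, missing values included in the denominator.
share_of_all = {k: round(100 * v / n_rows, 1) for k, v in counts.items()}

# Share of rows that actually have a value.
n_present = n_rows - missing
share_of_present = {k: round(100 * v / n_present, 1) for k, v in counts.items()}

print(share_of_all)
print(share_of_present)
```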

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
mujer 636037
57.6%
hombre 467911
42.4%

Most occurring characters

ValueCountFrequency (%)
M 1103948
18.4%
E 1103948
18.4%
R 1103948
18.4%
U 636037
10.6%
J 636037
10.6%
H 467911
7.8%
O 467911
7.8%
B 467911
7.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5987651
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1103948
18.4%
E 1103948
18.4%
R 1103948
18.4%
U 636037
10.6%
J 636037
10.6%
H 467911
7.8%
O 467911
7.8%
B 467911
7.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 5987651
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1103948
18.4%
E 1103948
18.4%
R 1103948
18.4%
U 636037
10.6%
J 636037
10.6%
H 467911
7.8%
O 467911
7.8%
B 467911
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5987651
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1103948
18.4%
E 1103948
18.4%
R 1103948
18.4%
U 636037
10.6%
J 636037
10.6%
H 467911
7.8%
O 467911
7.8%
B 467911
7.8%
Distinct: 671
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB

Length

Max length42
Median length32
Mean length8.4474937
Min length2

Characters and Unicode

Total characters9334244
Distinct characters27
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique57 ?
Unique (%)< 0.1%

Sample

1st rowINGENIERO ELECTRICISTA
2nd rowINGENIERO ELECTRICISTA
3rd rowINGENIERO ELECTRICISTA
4th rowINGENIERO ELECTRICISTA
5th rowINGENIERO CIVIL
ValueCountFrequency (%)
de 262020
15.4%
sd 249697
14.7%
casa 213505
12.6%
ama 213505
12.6%
estudiante 129091
 
7.6%
otro 65827
 
3.9%
comerciante 38321
 
2.3%
agricultor 31888
 
1.9%
maestro 16420
 
1.0%
medico 14738
 
0.9%
Other values (685) 461523
27.2%

Most occurring characters

ValueCountFrequency (%)
A 1487783
15.9%
E 1032366
11.1%
D 817230
8.8%
S 747453
8.0%
O 640314
 
6.9%
I 629534
 
6.7%
T 614364
 
6.6%
591563
 
6.3%
R 579450
 
6.2%
C 555503
 
6.0%
Other values (17) 1638684
17.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8742665
93.7%
Space Separator 591563
 
6.3%
Decimal Number 16
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1487783
17.0%
E 1032366
11.8%
D 817230
9.3%
S 747453
8.5%
O 640314
7.3%
I 629534
7.2%
T 614364
7.0%
R 579450
 
6.6%
C 555503
 
6.4%
M 423989
 
4.8%
Other values (15) 1214679
13.9%
Space Separator
ValueCountFrequency (%)
591563
100.0%
Decimal Number
ValueCountFrequency (%)
0 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8742665
93.7%
Common 591579
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1487783
17.0%
E 1032366
11.8%
D 817230
9.3%
S 747453
8.5%
O 640314
7.3%
I 629534
7.2%
T 614364
7.0%
R 579450
 
6.6%
C 555503
 
6.4%
M 423989
 
4.8%
Other values (15) 1214679
13.9%
Common
ValueCountFrequency (%)
591563
> 99.9%
0 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9321908
99.9%
None 12336
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1487783
16.0%
E 1032366
11.1%
D 817230
8.8%
S 747453
8.0%
O 640314
 
6.9%
I 629534
 
6.8%
T 614364
 
6.6%
591563
 
6.3%
R 579450
 
6.2%
C 555503
 
6.0%
Other values (16) 1626348
17.4%
None
ValueCountFrequency (%)
Ñ 12336
100.0%

Region
Real number (ℝ)

HIGH CORRELATION 

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.986597
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 6
Median: 17
Q3: 19
95-th percentile: 20
Maximum: 20
Range: 19
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 6.797058
Coefficient of variation (CV): 0.52339023
Kurtosis: -1.4742553
Mean: 12.986597
Median Absolute Deviation (MAD): 3
Skewness: -0.44379807
Sum: 14349826
Variance: 46.199998
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value | Count | Frequency (%)
19 | 386852 | 35.0%
20 | 103294 | 9.3%
5 | 83060 | 7.5%
6 | 69644 | 6.3%
1 | 61380 | 5.6%
7 | 58090 | 5.3%
18 | 56020 | 5.1%
8 | 41227 | 3.7%
4 | 33842 | 3.1%
15 | 33193 | 3.0%
Other values (10) | 178370 | 16.1%
Value | Count | Frequency (%)
1 | 61380 | 5.6%
2 | 27813 | 2.5%
3 | 30456 | 2.8%
4 | 33842 | 3.1%
5 | 83060 | 7.5%
6 | 69644 | 6.3%
7 | 58090 | 5.3%
8 | 41227 | 3.7%
9 | 2952 | 0.3%
10 | 24817 | 2.2%
Value | Count | Frequency (%)
20 | 103294 | 9.3%
19 | 386852 | 35.0%
18 | 56020 | 5.1%
17 | 8683 | 0.8%
16 | 18099 | 1.6%
15 | 33193 | 3.0%
14 | 13789 | 1.2%
13 | 6733 | 0.6%
12 | 26335 | 2.4%
11 | 18693 | 1.7%

Departamento
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB
FRANCISCO MORAZAN: 428079
CORTES: 186354
CHOLUTECA: 69644
ATLANTIDA: 61380
EL PARAISO: 58090
Other values (13): 301425

Length

Max length17
Median length14
Mean length11.198267
Min length4

Characters and Unicode

Total characters12373771
Distinct characters23
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFRANCISCO MORAZAN
2nd rowFRANCISCO MORAZAN
3rd rowCOLON
4th rowCOLON
5th rowVALLE

Common Values

Value | Count | Frequency (%)
FRANCISCO MORAZAN | 428079 | 38.7%
CORTES | 186354 | 16.9%
CHOLUTECA | 69644 | 6.3%
ATLANTIDA | 61380 | 5.6%
EL PARAISO | 58090 | 5.3%
YORO | 56020 | 5.1%
COPAN | 33842 | 3.1%
OLANCHO | 33193 | 3.0%
COMAYAGUA | 30456 | 2.8%
COLON | 27813 | 2.5%
Other values (8) | 120101 | 10.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
francisco 428079
25.2%
morazan 428079
25.2%
cortes 186354
11.0%
choluteca 69644
 
4.1%
atlantida 61380
 
3.6%
el 58090
 
3.4%
paraiso 58090
 
3.4%
yoro 56020
 
3.3%
la 45028
 
2.7%
copan 33842
 
2.0%
Other values (16) 272952
16.1%

Most occurring characters

ValueCountFrequency (%)
A 2079630
16.8%
O 1499126
12.1%
C 1348662
10.9%
R 1202505
9.7%
N 1055302
8.5%
S 733912
 
5.9%
I 647206
 
5.2%
592586
 
4.8%
M 465268
 
3.8%
Z 454414
 
3.7%
Other values (13) 2295160
18.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11781185
95.2%
Space Separator 592586
 
4.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2079630
17.7%
O 1499126
12.7%
C 1348662
11.4%
R 1202505
10.2%
N 1055302
9.0%
S 733912
 
6.2%
I 647206
 
5.5%
M 465268
 
3.9%
Z 454414
 
3.9%
T 435463
 
3.7%
Other values (12) 1859697
15.8%
Space Separator
ValueCountFrequency (%)
592586
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11781185
95.2%
Common 592586
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2079630
17.7%
O 1499126
12.7%
C 1348662
11.4%
R 1202505
10.2%
N 1055302
9.0%
S 733912
 
6.2%
I 647206
 
5.5%
M 465268
 
3.9%
Z 454414
 
3.9%
T 435463
 
3.7%
Other values (12) 1859697
15.8%
Common
ValueCountFrequency (%)
592586
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12373771
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2079630
16.8%
O 1499126
12.1%
C 1348662
10.9%
R 1202505
9.7%
N 1055302
8.5%
S 733912
 
5.9%
I 647206
 
5.2%
592586
 
4.8%
M 465268
 
3.8%
Z 454414
 
3.7%
Other values (13) 2295160
18.5%
Distinct: 281
Distinct (%): < 0.1%
Missing: 415
Missing (%): < 0.1%
Memory size: 8.4 MiB

Length

Max length25
Median length23
Mean length12.289611
Min length4

Characters and Unicode

Total characters13574576
Distinct characters31
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDISTRITO CENTRAL
2nd rowDISTRITO CENTRAL
3rd rowTRUJILLO
4th rowTRUJILLO
5th rowSAN LORENZO
ValueCountFrequency (%)
distrito 383925
18.4%
central 383925
18.4%
san 142043
 
6.8%
pedro 104270
 
5.0%
sula 102551
 
4.9%
la 85016
 
4.1%
de 66079
 
3.2%
choluteca 65980
 
3.2%
santa 52804
 
2.5%
ceiba 50347
 
2.4%
Other values (289) 649081
31.1%

Most occurring characters

ValueCountFrequency (%)
A 1610992
11.9%
T 1490929
11.0%
R 1189089
8.8%
I 1062065
7.8%
O 1035830
 
7.6%
E 985866
 
7.3%
981857
 
7.2%
L 905828
 
6.7%
S 884865
 
6.5%
N 801012
 
5.9%
Other values (21) 2626243
19.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12592719
92.8%
Space Separator 981857
 
7.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1610992
12.8%
T 1490929
11.8%
R 1189089
9.4%
I 1062065
8.4%
O 1035830
8.2%
E 985866
7.8%
L 905828
7.2%
S 884865
7.0%
N 801012
6.4%
C 797709
6.3%
Other values (20) 1828534
14.5%
Space Separator
ValueCountFrequency (%)
981857
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12592719
92.8%
Common 981857
 
7.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1610992
12.8%
T 1490929
11.8%
R 1189089
9.4%
I 1062065
8.4%
O 1035830
8.2%
E 985866
7.8%
L 905828
7.2%
S 884865
7.0%
N 801012
6.4%
C 797709
6.3%
Other values (20) 1828534
14.5%
Common
ValueCountFrequency (%)
981857
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13573846
> 99.9%
None 730
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1610992
11.9%
T 1490929
11.0%
R 1189089
8.8%
I 1062065
7.8%
O 1035830
 
7.6%
E 985866
 
7.3%
981857
 
7.2%
L 905828
 
6.7%
S 884865
 
6.5%
N 801012
 
5.9%
Other values (17) 2625513
19.3%
None
ValueCountFrequency (%)
Ñ 585
80.1%
Í 81
 
11.1%
Ó 53
 
7.3%
Á 11
 
1.5%

DepartamentoResidencia
Categorical

HIGH CORRELATION  MISSING 

Distinct: 20
Distinct (%): < 0.1%
Missing: 202752
Missing (%): 18.3%
Memory size: 8.4 MiB
METROPOLITANA DEL DISTRITO CENTRAL: 317728
METROPOLITANA DE SAN PEDRO SULA: 75785
CORTES: 68046
CHOLUTECA: 67561
ATLANTIDA: 51601
Other values (15): 321499

Length

Max length34
Median length31
Mean length19.357165
Min length4

Characters and Unicode

Total characters17464421
Distinct characters23
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCOLON
2nd rowATLANTIDA
3rd rowINTIBUCA
4th rowMETROPOLITANA DEL DISTRITO CENTRAL
5th rowMETROPOLITANA DEL DISTRITO CENTRAL

Common Values

Value | Count | Frequency (%)
METROPOLITANA DEL DISTRITO CENTRAL | 317728 | 28.8%
METROPOLITANA DE SAN PEDRO SULA | 75785 | 6.9%
CORTES | 68046 | 6.2%
CHOLUTECA | 67561 | 6.1%
ATLANTIDA | 51601 | 4.7%
YORO | 51482 | 4.7%
EL PARAISO | 35945 | 3.3%
FRANCISCO MORAZAN | 33598 | 3.0%
COPAN | 27870 | 2.5%
OLANCHO | 27142 | 2.5%
Other values (10) | 145462 | 13.2%
(Missing) | 202752 | 18.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
metropolitana 393513
16.9%
central 317728
13.7%
del 317728
13.7%
distrito 317728
13.7%
de 94254
 
4.1%
san 75785
 
3.3%
pedro 75785
 
3.3%
sula 75785
 
3.3%
cortes 68046
 
2.9%
choluteca 67561
 
2.9%
Other values (23) 518216
22.3%

Most occurring characters

ValueCountFrequency (%)
T 2028605
11.6%
A 2025526
11.6%
O 1694152
9.7%
1419909
8.1%
E 1412489
8.1%
L 1378734
7.9%
R 1374416
7.9%
I 1239155
7.1%
N 1021502
 
5.8%
D 859099
 
4.9%
Other values (13) 3010834
17.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16044512
91.9%
Space Separator 1419909
 
8.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 2028605
12.6%
A 2025526
12.6%
O 1694152
10.6%
E 1412489
8.8%
L 1378734
8.6%
R 1374416
8.6%
I 1239155
7.7%
N 1021502
6.4%
D 859099
 
5.4%
C 722885
 
4.5%
Other values (12) 2287949
14.3%
Space Separator
ValueCountFrequency (%)
1419909
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16044512
91.9%
Common 1419909
 
8.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 2028605
12.6%
A 2025526
12.6%
O 1694152
10.6%
E 1412489
8.8%
L 1378734
8.6%
R 1374416
8.6%
I 1239155
7.7%
N 1021502
6.4%
D 859099
 
5.4%
C 722885
 
4.5%
Other values (12) 2287949
14.3%
Common
ValueCountFrequency (%)
1419909
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17464421
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 2028605
11.6%
A 2025526
11.6%
O 1694152
9.7%
1419909
8.1%
E 1412489
8.1%
L 1378734
7.9%
R 1374416
7.9%
I 1239155
7.1%
N 1021502
 
5.8%
D 859099
 
4.9%
Other values (13) 3010834
17.2%

MunicipioResidencia
Text

MISSING 

Distinct 287
Distinct (%) < 0.1%
Missing 202763
Missing (%) 18.4%
Memory size 8.4 MiB

Length

Max length 25
Median length 23
Mean length 12.343118
Min length 4

Characters and Unicode

Total characters 11136072
Distinct characters 30
Distinct categories 2
Distinct scripts 2
Distinct blocks 2

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row TRUJILLO
2nd row LA CEIBA
3rd row INTIBUCA
4th row DISTRITO CENTRAL
5th row DISTRITO CENTRAL
ValueCountFrequency (%)
distrito 315554
18.6%
central 315554
18.6%
san 108370
 
6.4%
pedro 75650
 
4.5%
sula 74242
 
4.4%
la 63525
 
3.7%
de 52852
 
3.1%
choluteca 51146
 
3.0%
el 44781
 
2.6%
santa 43282
 
2.6%
Other values (295) 549766
32.4%

Most occurring characters

ValueCountFrequency (%)
A 1306495
11.7%
T 1226083
11.0%
R 988387
8.9%
I 878370
7.9%
O 855915
 
7.7%
E 813976
 
7.3%
793038
 
7.1%
L 736486
 
6.6%
S 713958
 
6.4%
C 659527
 
5.9%
Other values (20) 2163837
19.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10343034
92.9%
Space Separator 793038
 
7.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1306495
12.6%
T 1226083
11.9%
R 988387
9.6%
I 878370
8.5%
O 855915
8.3%
E 813976
7.9%
L 736486
7.1%
S 713958
6.9%
C 659527
6.4%
N 656318
6.3%
Other values (19) 1507519
14.6%
Space Separator
ValueCountFrequency (%)
793038
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10343034
92.9%
Common 793038
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1306495
12.6%
T 1226083
11.9%
R 988387
9.6%
I 878370
8.5%
O 855915
8.3%
E 813976
7.9%
L 736486
7.1%
S 713958
6.9%
C 659527
6.4%
N 656318
6.3%
Other values (19) 1507519
14.6%
Common
ValueCountFrequency (%)
793038
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11135221
> 99.9%
None 851
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1306495
11.7%
T 1226083
11.0%
R 988387
8.9%
I 878370
7.9%
O 855915
 
7.7%
E 813976
 
7.3%
793038
 
7.1%
L 736486
 
6.6%
S 713958
 
6.4%
C 659527
 
5.9%
Other values (17) 2162986
19.4%
None
ValueCountFrequency (%)
Í 381
44.8%
Ñ 367
43.1%
Ó 103
 
12.1%
Distinct 139144
Distinct (%) 12.6%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB

Length

Max length 368
Median length 131
Mean length 14.591852
Min length 0

Characters and Unicode

Total characters 16123588
Distinct characters 89
Distinct categories 17
Distinct scripts 2
Distinct blocks 3

Unique

Unique 78995
Unique (%) 7.1%

Sample

1st row LOMA LINDA
2nd row LUIS LANDA
3rd row JERICO
4th row CAPIRO
5th row EL CENTRO SAN LORENZO, VALLE
ValueCountFrequency (%)
col 191525
 
6.7%
el 104242
 
3.7%
la 96410
 
3.4%
de 90010
 
3.2%
san 81349
 
2.9%
barrio 57246
 
2.0%
sd 55243
 
1.9%
colonia 50759
 
1.8%
las 42837
 
1.5%
casa 37370
 
1.3%
Other values (44592) 2039184
71.6%

Most occurring characters

ValueCountFrequency (%)
A 2167977
13.4%
1747100
10.8%
O 1318900
 
8.2%
L 1295652
 
8.0%
E 1292152
 
8.0%
R 987092
 
6.1%
S 929440
 
5.8%
I 853276
 
5.3%
N 834597
 
5.2%
C 819262
 
5.1%
Other values (79) 3878140
24.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 13713974
85.1%
Space Separator 1747100
 
10.8%
Other Punctuation 365545
 
2.3%
Decimal Number 254771
 
1.6%
Dash Punctuation 24277
 
0.2%
Other Symbol 12351
 
0.1%
Other Letter 2701
 
< 0.1%
Control 1363
 
< 0.1%
Open Punctuation 544
 
< 0.1%
Close Punctuation 538
 
< 0.1%
Other values (7) 424
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2167977
15.8%
O 1318900
9.6%
L 1295652
9.4%
E 1292152
9.4%
R 987092
 
7.2%
S 929440
 
6.8%
I 853276
 
6.2%
N 834597
 
6.1%
C 819262
 
6.0%
D 513511
 
3.7%
Other values (30) 2702115
19.7%
Other Punctuation
ValueCountFrequency (%)
. 238993
65.4%
, 101912
27.9%
/ 10897
 
3.0%
# 10481
 
2.9%
; 1314
 
0.4%
: 888
 
0.2%
" 782
 
0.2%
* 202
 
0.1%
' 21
 
< 0.1%
¡ 15
 
< 0.1%
Other values (5) 40
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 53376
21.0%
2 45263
17.8%
3 30910
12.1%
4 21330
 
8.4%
0 19774
 
7.8%
5 19494
 
7.7%
9 18552
 
7.3%
8 15800
 
6.2%
6 15715
 
6.2%
7 14557
 
5.7%
Open Punctuation
ValueCountFrequency (%)
( 530
97.4%
[ 8
 
1.5%
{ 6
 
1.1%
Close Punctuation
ValueCountFrequency (%)
) 527
98.0%
} 6
 
1.1%
] 5
 
0.9%
Modifier Symbol
ValueCountFrequency (%)
´ 81
85.3%
` 12
 
12.6%
¨ 2
 
2.1%
Other Letter
ValueCountFrequency (%)
º 2048
75.8%
ª 653
 
24.2%
Control
ValueCountFrequency (%)
795
58.3%
568
41.7%
Math Symbol
ValueCountFrequency (%)
| 285
96.9%
+ 9
 
3.1%
Other Number
ValueCountFrequency (%)
½ 4
80.0%
1
 
20.0%
Space Separator
ValueCountFrequency (%)
1747100
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 24277
100.0%
Other Symbol
ValueCountFrequency (%)
° 12351
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 14
100.0%
Final Punctuation
ValueCountFrequency (%)
7
100.0%
Initial Punctuation
ValueCountFrequency (%)
7
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13716675
85.1%
Common 2406913
 
14.9%

Most frequent character per script

Common
ValueCountFrequency (%)
1747100
72.6%
. 238993
 
9.9%
, 101912
 
4.2%
1 53376
 
2.2%
2 45263
 
1.9%
3 30910
 
1.3%
- 24277
 
1.0%
4 21330
 
0.9%
0 19774
 
0.8%
5 19494
 
0.8%
Other values (37) 104484
 
4.3%
Latin
ValueCountFrequency (%)
A 2167977
15.8%
O 1318900
9.6%
L 1295652
9.4%
E 1292152
9.4%
R 987092
 
7.2%
S 929440
 
6.8%
I 853276
 
6.2%
N 834597
 
6.1%
C 819262
 
6.0%
D 513511
 
3.7%
Other values (32) 2704816
19.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16078430
99.7%
None 45144
 
0.3%
Punctuation 14
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2167977
13.5%
1747100
10.9%
O 1318900
 
8.2%
L 1295652
 
8.1%
E 1292152
 
8.0%
R 987092
 
6.1%
S 929440
 
5.8%
I 853276
 
5.3%
N 834597
 
5.2%
C 819262
 
5.1%
Other values (53) 3832982
23.8%
None
ValueCountFrequency (%)
Ñ 23174
51.3%
° 12351
27.4%
Í 2077
 
4.6%
º 2048
 
4.5%
Ó 1987
 
4.4%
É 1236
 
2.7%
Á 1170
 
2.6%
ª 653
 
1.4%
Ú 148
 
0.3%
Ü 120
 
0.3%
Other values (14) 180
 
0.4%
Punctuation
ValueCountFrequency (%)
7
50.0%
7
50.0%

Telefono
Text

MISSING 

Distinct 364188
Distinct (%) 40.4%
Missing 202750
Missing (%) 18.3%
Memory size 8.4 MiB

Length

Max length 10
Median length 8
Mean length 5.6300999
Min length 0

Characters and Unicode

Total characters 5079600
Distinct characters 71
Distinct categories 12
Distinct scripts 2
Distinct blocks 3

Unique

Unique 227976
Unique (%) 25.3%

Sample

1st row31455867
2nd row31696109
3rd row98844382
4th row9498-6036
5th row
ValueCountFrequency (%)
sd 15379
 
2.4%
00000000 8136
 
1.3%
no 3919
 
0.6%
tiene 2956
 
0.5%
nt 1477
 
0.2%
porta 822
 
0.1%
93761600 583
 
0.1%
np 540
 
0.1%
0 481
 
0.1%
0000000 408
 
0.1%
Other values (364375) 614499
94.7%
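
The token counts for Telefono show real numbers mixed with placeholders (sd, no, tiene, nt, np) and runs of zeros, and the character tables list hyphens and other separators. The cleaning step this suggests can be sketched as follows. This is a minimal sketch, not part of the report: it assumes a usable number has exactly eight digits once separators are stripped (an assumption suggested by the median length of 8), and `normalize_phone` is a hypothetical helper name.

```python
import re
from typing import Optional

def normalize_phone(raw: Optional[str]) -> Optional[str]:
    """Return an 8-digit string, or None for missing or placeholder values.

    Assumption (not stated in the report): a usable number has exactly
    eight ASCII digits after separators such as '-' or spaces are removed.
    """
    if raw is None:
        return None
    digits = re.sub(r"[^0-9]", "", raw)  # drop letters, hyphens, spaces, symbols
    if len(digits) != 8:
        return None                      # covers 'SD', 'NO TIENE', '0', empty strings
    if set(digits) == {"0"}:
        return None                      # all-zero filler such as 00000000
    return digits
```

Returning None for every rejected value keeps placeholders from being counted as distinct phone numbers; whether rejected rows should instead be logged for review is a separate design choice.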

Most occurring characters

ValueCountFrequency (%)
9 873247
17.2%
8 561484
11.1%
3 538476
10.6%
7 460630
9.1%
0 448856
8.8%
5 433572
8.5%
6 431622
8.5%
2 427126
8.4%
4 400567
7.9%
1 397515
7.8%
Other values (61) 106505
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4973095
97.9%
Uppercase Letter 60299
 
1.2%
Dash Punctuation 36093
 
0.7%
Lowercase Letter 5158
 
0.1%
Space Separator 4660
 
0.1%
Other Punctuation 220
 
< 0.1%
Math Symbol 57
 
< 0.1%
Connector Punctuation 10
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Other Letter 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 15034
24.9%
D 14824
24.6%
N 8639
14.3%
E 5186
 
8.6%
T 5062
 
8.4%
O 4492
 
7.4%
I 2906
 
4.8%
P 1502
 
2.5%
A 1241
 
2.1%
R 818
 
1.4%
Other values (17) 595
 
1.0%
Lowercase Letter
ValueCountFrequency (%)
n 878
17.0%
e 866
16.8%
d 788
15.3%
s 715
13.9%
o 598
11.6%
t 554
10.7%
i 459
8.9%
p 86
 
1.7%
a 84
 
1.6%
r 73
 
1.4%
Other values (7) 57
 
1.1%
Decimal Number
ValueCountFrequency (%)
9 873247
17.6%
8 561484
11.3%
3 538476
10.8%
7 460630
9.3%
0 448856
9.0%
5 433572
8.7%
6 431622
8.7%
2 427126
8.6%
4 400567
8.1%
1 397515
8.0%
Other Punctuation
ValueCountFrequency (%)
/ 94
42.7%
. 82
37.3%
* 33
 
15.0%
¡ 4
 
1.8%
' 4
 
1.8%
? 2
 
0.9%
¿ 1
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 36090
> 99.9%
3
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
| 43
75.4%
+ 14
 
24.6%
Space Separator
ValueCountFrequency (%)
4660
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 10
100.0%
Close Punctuation
ValueCountFrequency (%)
} 4
100.0%
Other Letter
ValueCountFrequency (%)
º 2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5014141
98.7%
Latin 65459
 
1.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 15034
23.0%
D 14824
22.6%
N 8639
13.2%
E 5186
 
7.9%
T 5062
 
7.7%
O 4492
 
6.9%
I 2906
 
4.4%
P 1502
 
2.3%
A 1241
 
1.9%
n 878
 
1.3%
Other values (35) 5695
 
8.7%
Common
ValueCountFrequency (%)
9 873247
17.4%
8 561484
11.2%
3 538476
10.7%
7 460630
9.2%
0 448856
9.0%
5 433572
8.6%
6 431622
8.6%
2 427126
8.5%
4 400567
8.0%
1 397515
7.9%
Other values (16) 41046
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5079587
> 99.9%
None 9
 
< 0.1%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 873247
17.2%
8 561484
11.1%
3 538476
10.6%
7 460630
9.1%
0 448856
8.8%
5 433572
8.5%
6 431622
8.5%
2 427126
8.4%
4 400567
7.9%
1 397515
7.8%
Other values (54) 106492
 
2.1%
None
ValueCountFrequency (%)
¡ 4
44.4%
º 2
22.2%
Ç 1
 
11.1%
Ñ 1
 
11.1%
¿ 1
 
11.1%
Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Distinct 906
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB

Length

Max length 56
Median length 47
Mean length 30.183729
Min length 16

Characters and Unicode

Total characters 33352175
Distinct characters 70
Distinct categories 8
Distinct scripts 2
Distinct blocks 2

Unique

Unique 81
Unique (%) < 0.1%

Sample

1st row (90040)(TJE) TRIAJE CCG
2nd row (90040)(TJE) TRIAJE CCG
3rd row (90059)(TJE) TRIAJE TRUJILLO
4th row (90059)(TJE) TRIAJE TRUJILLO
5th row (8982)(H.ARE) SAN LORENZO
ValueCountFrequency (%)
triaje 452540
 
11.1%
de 126070
 
3.1%
la 86386
 
2.1%
san 84387
 
2.1%
del 81819
 
2.0%
sps 62309
 
1.5%
choluteca 59723
 
1.5%
el 56748
 
1.4%
pri 55457
 
1.4%
sur 53665
 
1.3%
Other values (1620) 2962988
72.6%

Most occurring characters

ValueCountFrequency (%)
2981901
 
8.9%
A 2888728
 
8.7%
E 2350353
 
7.0%
( 2302871
 
6.9%
) 2302871
 
6.9%
R 1662736
 
5.0%
I 1627003
 
4.9%
T 1545514
 
4.6%
0 1340809
 
4.0%
O 1283744
 
3.8%
Other values (60) 13065645
39.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 20211397
60.6%
Decimal Number 4976135
 
14.9%
Space Separator 2981901
 
8.9%
Open Punctuation 2302871
 
6.9%
Close Punctuation 2302871
 
6.9%
Other Punctuation 544272
 
1.6%
Dash Punctuation 18921
 
0.1%
Lowercase Letter 13807
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2888728
14.3%
E 2350353
11.6%
R 1662736
 
8.2%
I 1627003
 
8.0%
T 1545514
 
7.6%
O 1283744
 
6.4%
L 1257188
 
6.2%
S 1137936
 
5.6%
C 1091260
 
5.4%
J 1018786
 
5.0%
Other values (22) 4348149
21.5%
Lowercase Letter
ValueCountFrequency (%)
o 5202
37.7%
a 1782
 
12.9%
i 1425
 
10.3%
t 1090
 
7.9%
r 1020
 
7.4%
s 735
 
5.3%
l 556
 
4.0%
n 488
 
3.5%
e 352
 
2.5%
p 290
 
2.1%
Other values (11) 867
 
6.3%
Decimal Number
ValueCountFrequency (%)
0 1340809
26.9%
9 892821
17.9%
1 538670
10.8%
4 456211
 
9.2%
3 355782
 
7.1%
8 340793
 
6.8%
7 277456
 
5.6%
5 269105
 
5.4%
6 255382
 
5.1%
2 249106
 
5.0%
Other Punctuation
ValueCountFrequency (%)
. 468341
86.0%
, 74193
 
13.6%
/ 1738
 
0.3%
Space Separator
ValueCountFrequency (%)
2981901
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2302871
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2302871
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18921
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20225204
60.6%
Common 13126971
39.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2888728
14.3%
E 2350353
11.6%
R 1662736
 
8.2%
I 1627003
 
8.0%
T 1545514
 
7.6%
O 1283744
 
6.3%
L 1257188
 
6.2%
S 1137936
 
5.6%
C 1091260
 
5.4%
J 1018786
 
5.0%
Other values (43) 4361956
21.6%
Common
ValueCountFrequency (%)
2981901
22.7%
( 2302871
17.5%
) 2302871
17.5%
0 1340809
10.2%
9 892821
 
6.8%
1 538670
 
4.1%
. 468341
 
3.6%
4 456211
 
3.5%
3 355782
 
2.7%
8 340793
 
2.6%
Other values (7) 1145901
 
8.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33326340
99.9%
None 25835
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2981901
 
8.9%
A 2888728
 
8.7%
E 2350353
 
7.1%
( 2302871
 
6.9%
) 2302871
 
6.9%
R 1662736
 
5.0%
I 1627003
 
4.9%
T 1545514
 
4.6%
0 1340809
 
4.0%
O 1283744
 
3.9%
Other values (54) 13039810
39.1%
None
ValueCountFrequency (%)
Ñ 19873
76.9%
Á 5198
 
20.1%
Ó 519
 
2.0%
Ú 134
 
0.5%
Í 105
 
0.4%
É 6
 
< 0.1%

Prueba
Categorical

IMBALANCE 

Distinct 7
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB
SARS CoV-2 (RT-PCR) 587610
SARS CoV-2 (ANTIGENO-RDT) 514730
INFLUENZA 2346
Mpox (qPCR) 137
ZIKA (ELISA) 99
Other values (2) 50

Length

Max length 25
Median length 19
Mean length 21.771547
Min length 6

Characters and Unicode

Total characters 24056950
Distinct characters 28
Distinct categories 7
Distinct scripts 2
Distinct blocks 1

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row SARS CoV-2 (RT-PCR)
2nd row SARS CoV-2 (RT-PCR)
3rd row SARS CoV-2 (RT-PCR)
4th row SARS CoV-2 (RT-PCR)
5th row SARS CoV-2 (RT-PCR)

Common Values

Value                      Count   Frequency (%)
SARS CoV-2 (RT-PCR)        587610  53.2%
SARS CoV-2 (ANTIGENO-RDT)  514730  46.6%
INFLUENZA                  2346    0.2%
Mpox (qPCR)                137     < 0.1%
ZIKA (ELISA)               99      < 0.1%
DENGUE                     49      < 0.1%
SIFILIS                    1       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
sars 1102340
33.3%
cov-2 1102340
33.3%
rt-pcr 587610
17.8%
antigeno-rdt 514730
15.6%
influenza 2346
 
0.1%
mpox 137
 
< 0.1%
qpcr 137
 
< 0.1%
zika 99
 
< 0.1%
elisa 99
 
< 0.1%
dengue 49
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
R 2792427
11.6%
2204916
 
9.2%
S 2204781
 
9.2%
- 2204680
 
9.2%
C 1690087
 
7.0%
A 1619614
 
6.7%
T 1617070
 
6.7%
( 1102576
 
4.6%
) 1102576
 
4.6%
o 1102477
 
4.6%
Other values (18) 6415746
26.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15236974
63.3%
Space Separator 2204916
 
9.2%
Dash Punctuation 2204680
 
9.2%
Lowercase Letter 1102888
 
4.6%
Open Punctuation 1102576
 
4.6%
Close Punctuation 1102576
 
4.6%
Decimal Number 1102340
 
4.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 2792427
18.3%
S 2204781
14.5%
C 1690087
11.1%
A 1619614
10.6%
T 1617070
10.6%
V 1102340
 
7.2%
N 1034201
 
6.8%
P 587747
 
3.9%
I 517277
 
3.4%
E 517273
 
3.4%
Other values (9) 1554157
10.2%
Lowercase Letter
ValueCountFrequency (%)
o 1102477
> 99.9%
p 137
 
< 0.1%
x 137
 
< 0.1%
q 137
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2204916
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2204680
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1102576
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1102576
100.0%
Decimal Number
ValueCountFrequency (%)
2 1102340
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16339862
67.9%
Common 7717088
32.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 2792427
17.1%
S 2204781
13.5%
C 1690087
10.3%
A 1619614
9.9%
T 1617070
9.9%
o 1102477
 
6.7%
V 1102340
 
6.7%
N 1034201
 
6.3%
P 587747
 
3.6%
I 517277
 
3.2%
Other values (13) 2071841
12.7%
Common
ValueCountFrequency (%)
2204916
28.6%
- 2204680
28.6%
( 1102576
14.3%
) 1102576
14.3%
2 1102340
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24056950
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 2792427
11.6%
2204916
 
9.2%
S 2204781
 
9.2%
- 2204680
 
9.2%
C 1690087
 
7.0%
A 1619614
 
6.7%
T 1617070
 
6.7%
( 1102576
 
4.6%
) 1102576
 
4.6%
o 1102477
 
4.6%
Other values (18) 6415746
26.7%

Resultado
Categorical

IMBALANCE 

Distinct 4
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB
NEGATIVO 824751
POSITIVO 278283
NO SE PROCESO 1331
REPETIR 607

Length

Max length 13
Median length 8
Mean length 8.0054734
Min length 7

Characters and Unicode

Total characters 8845824
Distinct characters 13
Distinct categories 2
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row NEGATIVO
2nd row POSITIVO
3rd row NEGATIVO
4th row POSITIVO
5th row POSITIVO

Common Values

Value          Count   Frequency (%)
NEGATIVO       824751  74.6%
POSITIVO       278283  25.2%
NO SE PROCESO  1331    0.1%
REPETIR        607     0.1%
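
Resultado carries the IMBALANCE flag, and the counts above make it possible to check how such a score can arise. The sketch below assumes the score is one minus the Shannon entropy of the category frequencies, normalized by the logarithm of the number of categories. That definition is an assumption, not something this section states; `imbalance_score` is a hypothetical name.

```python
import math

def imbalance_score(counts):
    """One minus normalized Shannon entropy of the category counts.

    0.0 means perfectly balanced categories; values near 1.0 mean that
    one category dominates. Assumed definition, see the lead-in.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two categories")
    total = sum(counts)
    # Zero-count categories contribute nothing to the entropy.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Counts for NEGATIVO, POSITIVO, NO SE PROCESO, REPETIR from the table above.
resultado_counts = [824751, 278283, 1331, 607]
score = imbalance_score(resultado_counts)
print(f"{score:.1%}")
```

Under this assumed definition, the two rare categories (NO SE PROCESO and REPETIR) raise the score considerably: they add to the number of categories in the normalizer while contributing almost nothing to the entropy.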

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
negativo 824751
74.5%
positivo 278283
 
25.1%
no 1331
 
0.1%
se 1331
 
0.1%
proceso 1331
 
0.1%
repetir 607
 
0.1%

Most occurring characters

ValueCountFrequency (%)
O 1385310
15.7%
I 1381924
15.6%
T 1103641
12.5%
V 1103034
12.5%
E 828627
9.4%
N 826082
9.3%
G 824751
9.3%
A 824751
9.3%
S 280945
 
3.2%
P 280221
 
3.2%
Other values (3) 6538
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8843162
> 99.9%
Space Separator 2662
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 1385310
15.7%
I 1381924
15.6%
T 1103641
12.5%
V 1103034
12.5%
E 828627
9.4%
N 826082
9.3%
G 824751
9.3%
A 824751
9.3%
S 280945
 
3.2%
P 280221
 
3.2%
Other values (2) 3876
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2662
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8843162
> 99.9%
Common 2662
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 1385310
15.7%
I 1381924
15.6%
T 1103641
12.5%
V 1103034
12.5%
E 828627
9.4%
N 826082
9.3%
G 824751
9.3%
A 824751
9.3%
S 280945
 
3.2%
P 280221
 
3.2%
Other values (2) 3876
 
< 0.1%
Common
ValueCountFrequency (%)
2662
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8845824
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 1385310
15.7%
I 1381924
15.6%
T 1103641
12.5%
V 1103034
12.5%
E 828627
9.4%
N 826082
9.3%
G 824751
9.3%
A 824751
9.3%
S 280945
 
3.2%
P 280221
 
3.2%
Other values (3) 6538
 
0.1%

CT
Text

MISSING 

Distinct 4010
Distinct (%) 2.6%
Missing 951212
Missing (%) 86.1%
Memory size 8.4 MiB

Length

Max length 40
Median length 5
Mean length 4.935666
Min length 1

Characters and Unicode

Total characters 758908
Distinct characters 65
Distinct categories 9
Distinct scripts 2
Distinct blocks 2

Unique

Unique 664
Unique (%) 0.4%

Sample

1st row 25.36
2nd row 29.37
3rd row 22.44
4th row 36.47
5th row 27.06
ValueCountFrequency (%)
no 288
 
0.2%
36.02 180
 
0.1%
35.02 174
 
0.1%
36.9 174
 
0.1%
36.06 169
 
0.1%
36.04 166
 
0.1%
36.08 163
 
0.1%
34.09 162
 
0.1%
36.54 161
 
0.1%
36.07 161
 
0.1%
Other values (3970) 152977
98.8%
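
CT is typed as Text even though most of its values look like decimal numbers. The token table contains the word "no", and the character tables for this column list commas and letters alongside the digits. A minimal sketch of a numeric conversion follows. It assumes that a comma stands for a decimal point and that anything still non-numeric should be treated as missing; both are assumptions, and `parse_ct` is a hypothetical helper.

```python
import math
from typing import Optional

def parse_ct(raw: Optional[str]) -> Optional[float]:
    """Convert a CT string to float, or None when it is not a finite number."""
    if raw is None:
        return None
    # Assumption: '36,9' is meant as 36.9 (comma used as decimal separator).
    text = raw.strip().replace(",", ".")
    try:
        value = float(text)
    except ValueError:
        return None  # free text such as 'NO'
    # float() also accepts 'nan' and 'inf'; treat those as missing too.
    return value if math.isfinite(value) else None
```

Applied to the whole column, a helper like this would let CT be profiled as a numeric variable, and the share of rows that come back as None would show how much free text the field holds.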

Most occurring characters

ValueCountFrequency (%)
. 152659
20.1%
3 127041
16.7%
2 86567
11.4%
1 66373
8.7%
6 51358
 
6.8%
5 49285
 
6.5%
4 47377
 
6.2%
7 45746
 
6.0%
0 44169
 
5.8%
9 40608
 
5.4%
Other values (55) 47725
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 597705
78.8%
Other Punctuation 152788
 
20.1%
Uppercase Letter 6944
 
0.9%
Space Separator 1100
 
0.1%
Lowercase Letter 211
 
< 0.1%
Dash Punctuation 157
 
< 0.1%
Modifier Symbol 1
 
< 0.1%
Math Symbol 1
 
< 0.1%
Other Letter 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 884
12.7%
O 843
12.1%
I 802
11.5%
N 740
10.7%
A 535
7.7%
C 519
 
7.5%
R 362
 
5.2%
T 362
 
5.2%
S 329
 
4.7%
M 242
 
3.5%
Other values (13) 1326
19.1%
Lowercase Letter
ValueCountFrequency (%)
o 28
13.3%
e 27
12.8%
i 24
11.4%
a 18
8.5%
c 17
8.1%
n 16
 
7.6%
s 13
 
6.2%
t 11
 
5.2%
f 9
 
4.3%
h 8
 
3.8%
Other values (13) 40
19.0%
Decimal Number
ValueCountFrequency (%)
3 127041
21.3%
2 86567
14.5%
1 66373
11.1%
6 51358
8.6%
5 49285
 
8.2%
4 47377
 
7.9%
7 45746
 
7.7%
0 44169
 
7.4%
9 40608
 
6.8%
8 39181
 
6.6%
Other Punctuation
ValueCountFrequency (%)
. 152659
99.9%
, 122
 
0.1%
' 6
 
< 0.1%
/ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1100
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 157
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%
Other Letter
ValueCountFrequency (%)
º 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 751752
99.1%
Latin 7156
 
0.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 884
12.4%
O 843
11.8%
I 802
11.2%
N 740
10.3%
A 535
 
7.5%
C 519
 
7.3%
R 362
 
5.1%
T 362
 
5.1%
S 329
 
4.6%
M 242
 
3.4%
Other values (37) 1538
21.5%
Common
ValueCountFrequency (%)
. 152659
20.3%
3 127041
16.9%
2 86567
11.5%
1 66373
8.8%
6 51358
 
6.8%
5 49285
 
6.6%
4 47377
 
6.3%
7 45746
 
6.1%
0 44169
 
5.9%
9 40608
 
5.4%
Other values (8) 40569
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 758906
> 99.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 152659
20.1%
3 127041
16.7%
2 86567
11.4%
1 66373
8.7%
6 51358
 
6.8%
5 49285
 
6.5%
4 47377
 
6.2%
7 45746
 
6.0%
0 44169
 
5.8%
9 40608
 
5.4%
Other values (53) 47723
 
6.3%
None
ValueCountFrequency (%)
´ 1
50.0%
º 1
50.0%
Distinct 207763
Distinct (%) 18.8%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB
Minimum 1927-03-20 00:00:00
Maximum 2023-07-04 17:06:08.083000
Histogram with fixed size bins (bins=50)

FechaInicioSintomas
Date

MISSING 

Distinct 1484
Distinct (%) 0.2%
Missing 210319
Missing (%) 19.0%
Memory size 8.4 MiB
Minimum 1800-02-04 00:00:00
Maximum 2023-07-04 00:00:00
Histogram with fixed size bins (bins=50)
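
The reported minimum for FechaInicioSintomas (1800-02-04) is not a believable symptom-onset date, so a range check is a natural follow-up. The sketch below flags dates outside a plausibility window. The lower bound of 2020-01-01 is purely an illustrative assumption and should be replaced by whatever the data owners consider valid; the upper bound is the maximum reported above. `is_plausible_onset` is a hypothetical name.

```python
from datetime import date
from typing import Optional

EARLIEST_PLAUSIBLE = date(2020, 1, 1)  # assumption for illustration, not from the report
LATEST_REPORTED = date(2023, 7, 4)     # maximum shown in the statistics above

def is_plausible_onset(
    onset: Optional[date],
    earliest: date = EARLIEST_PLAUSIBLE,
    latest: date = LATEST_REPORTED,
) -> bool:
    """True when the onset date exists and lies inside [earliest, latest]."""
    return onset is not None and earliest <= onset <= latest
```

Flagging instead of deleting keeps the suspect rows available for correction, for example when a mistyped year can be recovered from the sample date.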
Distinct 1316
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB
Minimum 1927-03-20 00:00:00
Maximum 2023-07-04 00:00:00
Histogram with fixed size bins (bins=50)

FK_EstablecimientoId
Real number (ℝ)

Distinct 194
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 48495.987
Minimum 19
Maximum 90286
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 8.4 MiB

Quantile statistics

Minimum 19
5-th percentile 1997
Q1 1997
Median 84051
Q3 90001
95-th percentile 90148
Maximum 90286
Range 90267
Interquartile range (IQR) 88004

Descriptive statistics

Standard deviation 42195.383
Coefficient of variation (CV) 0.8700799
Kurtosis -1.9526031
Mean 48495.987
Median Absolute Deviation (MAD) 6097
Skewness -0.16883465
Sum 5.3586708 × 10^10
Variance 1.7804504 × 10^9
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1997 353300
32.0%
84093 122087
 
11.0%
90001 61251
 
5.5%
84051 59141
 
5.4%
84085 54489
 
4.9%
90148 43250
 
3.9%
90004 31805
 
2.9%
311 19475
 
1.8%
90037 19253
 
1.7%
90040 17293
 
1.6%
Other values (184) 323628
29.3%
ValueCountFrequency (%)
19 2258
0.2%
27 1293
0.1%
35 642
 
0.1%
51 1775
0.2%
60 1096
0.1%
108 1667
0.2%
116 5
 
< 0.1%
124 1812
0.2%
132 1427
0.1%
141 1004
0.1%
ValueCountFrequency (%)
90286 902
 
0.1%
90285 43
 
< 0.1%
90277 1
 
< 0.1%
90274 2
 
< 0.1%
90269 15
 
< 0.1%
90265 155
 
< 0.1%
90264 26
 
< 0.1%
90263 2
 
< 0.1%
90207 2265
0.2%
90205 2309
0.2%
Distinct 194
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB

Length

Max length 54
Median length 52
Mean length 38.535752
Min length 17

Characters and Unicode

Total characters 42580927
Distinct characters 62
Distinct categories 8
Distinct scripts 2
Distinct blocks 2

Unique

Unique 5
Unique (%) < 0.1%

Sample

1st row (1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA
2nd row (1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA
3rd row (1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA
4th row (84051)(LAB RE) LAB REGIONAL DE ATLANTIDA
5th row (1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA
ValueCountFrequency (%)
de 725395
 
13.8%
laboratorio 404407
 
7.7%
1997)(vir 353300
 
6.7%
nacional 353300
 
6.7%
virologia 353300
 
6.7%
lab 260090
 
5.0%
re 260090
 
5.0%
regional 258907
 
4.9%
cortes 132879
 
2.5%
84093)(lab 122087
 
2.3%
Other values (437) 2021227
38.5%

Most occurring characters

ValueCountFrequency (%)
A 4407224
 
10.4%
4140334
 
9.7%
O 3320270
 
7.8%
R 3069854
 
7.2%
I 2799585
 
6.6%
L 2617826
 
6.1%
E 2382880
 
5.6%
( 2256118
 
5.3%
) 2256118
 
5.3%
N 1632364
 
3.8%
Other values (52) 13698354
32.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 28713661
67.4%
Decimal Number 4975599
 
11.7%
Space Separator 4140334
 
9.7%
Open Punctuation 2256118
 
5.3%
Close Punctuation 2256118
 
5.3%
Other Punctuation 211395
 
0.5%
Lowercase Letter 26629
 
0.1%
Dash Punctuation 1073
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 4407224
15.3%
O 3320270
11.6%
R 3069854
10.7%
I 2799585
9.8%
L 2617826
9.1%
E 2382880
8.3%
N 1632364
 
5.7%
T 1488443
 
5.2%
D 1077465
 
3.8%
C 1056705
 
3.7%
Other values (17) 4861045
16.9%
Lowercase Letter
ValueCountFrequency (%)
o 4694
17.6%
a 4597
17.3%
l 3809
14.3%
i 2599
9.8%
r 2293
8.6%
t 2203
8.3%
e 1806
 
6.8%
b 1146
 
4.3%
s 1072
 
4.0%
d 903
 
3.4%
Other values (9) 1507
 
5.7%
Decimal Number
ValueCountFrequency (%)
9 1222078
24.6%
0 1024344
20.6%
1 695670
14.0%
7 488678
 
9.8%
4 467999
 
9.4%
8 426940
 
8.6%
3 240191
 
4.8%
5 213517
 
4.3%
2 124167
 
2.5%
6 72015
 
1.4%
Other Punctuation
ValueCountFrequency (%)
. 200379
94.8%
, 11016
 
5.2%
Space Separator
ValueCountFrequency (%)
4140334
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2256118
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2256118
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1073
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28740290
67.5%
Common 13840637
32.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 4407224
15.3%
O 3320270
11.6%
R 3069854
10.7%
I 2799585
9.7%
L 2617826
9.1%
E 2382880
8.3%
N 1632364
 
5.7%
T 1488443
 
5.2%
D 1077465
 
3.7%
C 1056705
 
3.7%
Other values (36) 4887674
17.0%
Common
ValueCountFrequency (%)
4140334
29.9%
( 2256118
16.3%
) 2256118
16.3%
9 1222078
 
8.8%
0 1024344
 
7.4%
1 695670
 
5.0%
7 488678
 
3.5%
4 467999
 
3.4%
8 426940
 
3.1%
3 240191
 
1.7%
Other values (6) 622167
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42506507
99.8%
None 74420
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 4407224
 
10.4%
4140334
 
9.7%
O 3320270
 
7.8%
R 3069854
 
7.2%
I 2799585
 
6.6%
L 2617826
 
6.2%
E 2382880
 
5.6%
( 2256118
 
5.3%
) 2256118
 
5.3%
N 1632364
 
3.8%
Other values (50) 13623934
32.1%
None
ValueCountFrequency (%)
Á 68688
92.3%
Ñ 5732
 
7.7%

NumeroControl
Real number (ℝ)

HIGH CORRELATION 

Distinct 951
Distinct (%) 0.1%
Missing 797
Missing (%) 0.1%
Infinite 0
Infinite (%) 0.0%
Mean 652.30011
Minimum 262
Maximum 1222
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 8.4 MiB

Quantile statistics

Minimum 262
5-th percentile 333
Q1 470
Median 588
Q3 864
95-th percentile 1071
Maximum 1222
Range 960
Interquartile range (IQR) 394

Descriptive statistics

Standard deviation 236.22594
Coefficient of variation (CV) 0.36214302
Kurtosis -0.82469423
Mean 652.30011
Median Absolute Deviation (MAD) 161
Skewness 0.46124789
Sum 7.2025348 × 10^8
Variance 55802.693
Monotonicity Not monotonic
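
Several of the NumeroControl statistics are derived from others in the same tables, which gives a quick consistency check. The sketch below recomputes the range, the interquartile range, the coefficient of variation and the variance from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative, and small differences in the last digits are expected because the inputs are rounded.

```python
# Values copied from the NumeroControl statistics tables.
minimum, q1, q3, maximum = 262, 470, 864, 1222
mean, std = 652.30011, 236.22594

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = std squared

print(value_range, iqr, round(cv, 4), round(variance, 1))
```

The recomputed values agree with the reported Range (960), IQR (394), CV (about 0.3621) and Variance (about 55802.7).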
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
536 4032
 
0.4%
535 3963
 
0.4%
517 3750
 
0.3%
542 3724
 
0.3%
486 3723
 
0.3%
487 3642
 
0.3%
538 3582
 
0.3%
440 3488
 
0.3%
563 3433
 
0.3%
530 3389
 
0.3%
Other values (941) 1067449
96.6%
ValueCountFrequency (%)
262 733
0.1%
264 181
 
< 0.1%
265 1037
0.1%
266 1011
0.1%
267 1092
0.1%
268 637
0.1%
269 892
0.1%
270 900
0.1%
271 1097
0.1%
272 1002
0.1%
ValueCountFrequency (%)
1222 322
< 0.1%
1221 562
0.1%
1220 225
 
< 0.1%
1219 52
 
< 0.1%
1218 380
< 0.1%
1217 403
< 0.1%
1216 362
< 0.1%
1215 581
0.1%
1214 472
< 0.1%
1213 134
 
< 0.1%
Distinct 552
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 8.4 MiB

Length

Max length19
Median length15
Mean length7.7406052
Min length4

Characters and Unicode

Total characters8553152
Distinct characters35
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15 ?
Unique (%)< 0.1%

Sample

1st rowARMARTINEZ
2nd rowAPERDOMO
3rd rowNFIALLOS
4th rowFAMARTINEZ
5th rowNBENAVIDES
Value               Count   Frequency (%)
analiza             43250   3.9%
gavelar             26744   2.4%
mcastro             25565   2.3%
jmedina             20076   1.8%
dmartinez           19856   1.8%
lcarbajal           19679   1.8%
gcoelllo            19105   1.7%
aalvarado           18225   1.6%
acarbajal           16277   1.5%
nfiallos            15596   1.4%
Other values (542)  880599  79.7%

Most occurring characters

ValueCountFrequency (%)
A 1501001
17.5%
E 798943
 
9.3%
R 733340
 
8.6%
L 717534
 
8.4%
O 571721
 
6.7%
N 464842
 
5.4%
I 455092
 
5.3%
S 436717
 
5.1%
D 353678
 
4.1%
C 346496
 
4.1%
Other values (25) 2173788
25.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8550335
> 99.9%
Decimal Number 2751
 
< 0.1%
Lowercase Letter 66
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1501001
17.6%
E 798943
 
9.3%
R 733340
 
8.6%
L 717534
 
8.4%
O 571721
 
6.7%
N 464842
 
5.4%
I 455092
 
5.3%
S 436717
 
5.1%
D 353678
 
4.1%
C 346496
 
4.1%
Other values (16) 2170971
25.4%
Lowercase Letter
ValueCountFrequency (%)
m 11
16.7%
e 11
16.7%
d 11
16.7%
i 11
16.7%
c 11
16.7%
o 11
16.7%
Decimal Number
ValueCountFrequency (%)
1 1383
50.3%
0 1366
49.7%
2 2
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 8550401
> 99.9%
Common 2751
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1501001
17.6%
E 798943
 
9.3%
R 733340
 
8.6%
L 717534
 
8.4%
O 571721
 
6.7%
N 464842
 
5.4%
I 455092
 
5.3%
S 436717
 
5.1%
D 353678
 
4.1%
C 346496
 
4.1%
Other values (22) 2171037
25.4%
Common
ValueCountFrequency (%)
1 1383
50.3%
0 1366
49.7%
2 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8553152
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1501001
17.5%
E 798943
 
9.3%
R 733340
 
8.6%
L 717534
 
8.4%
O 571721
 
6.7%
N 464842
 
5.4%
I 455092
 
5.3%
S 436717
 
5.1%
D 353678
 
4.1%
C 346496
 
4.1%
Other values (25) 2173788
25.4%
Distinct: 408
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.4 MiB

Length

Max length: 12
Median length: 11
Mean length: 7.697272
Min length: 5

Characters and Unicode

Total characters: 8505270
Distinct characters: 33
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: YALVARENGA
2nd row: EVALLADARES
3rd row: FACOSTA
4th row: RTORRES
5th row: MVASQUEZ
Value               Count   Frequency (%)
evasquez            94657   8.6%
facosta             75783   6.9%
yalvarenga          48920   4.4%
evalladares         43502   3.9%
analiza             43250   3.9%
klopez              37665   3.4%
lflores             27787   2.5%
gavelar             26607   2.4%
iumana              26270   2.4%
mcastro             25791   2.3%
Other values (398)  654740  59.3%

Most occurring characters

ValueCountFrequency (%)
A 1499760
17.6%
E 1000415
11.8%
L 662411
 
7.8%
R 651218
 
7.7%
S 591966
 
7.0%
O 523384
 
6.2%
N 378047
 
4.4%
Z 323740
 
3.8%
V 318976
 
3.8%
I 318356
 
3.7%
Other values (23) 2236997
26.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8505202
> 99.9%
Lowercase Letter 66
 
< 0.1%
Decimal Number 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1499760
17.6%
E 1000415
11.8%
L 662411
 
7.8%
R 651218
 
7.7%
S 591966
 
7.0%
O 523384
 
6.2%
N 378047
 
4.4%
Z 323740
 
3.8%
V 318976
 
3.8%
I 318356
 
3.7%
Other values (16) 2236929
26.3%
Lowercase Letter
ValueCountFrequency (%)
m 11
16.7%
e 11
16.7%
d 11
16.7%
i 11
16.7%
c 11
16.7%
o 11
16.7%
Decimal Number
ValueCountFrequency (%)
2 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8505268
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1499760
17.6%
E 1000415
11.8%
L 662411
 
7.8%
R 651218
 
7.7%
S 591966
 
7.0%
O 523384
 
6.2%
N 378047
 
4.4%
Z 323740
 
3.8%
V 318976
 
3.8%
I 318356
 
3.7%
Other values (22) 2236995
26.3%
Common
ValueCountFrequency (%)
2 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8505270
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1499760
17.6%
E 1000415
11.8%
L 662411
 
7.8%
R 651218
 
7.7%
S 591966
 
7.0%
O 523384
 
6.2%
N 378047
 
4.4%
Z 323740
 
3.8%
V 318976
 
3.8%
I 318356
 
3.7%
Other values (23) 2236997
26.3%

Asintomatico
Categorical

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 624025
SI: 221240

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         624025  56.5%
SI         221240  20.0%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     624025  73.8%
si     221240  26.2%
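
The two Asintomatico frequency tables above report different percentages for the same counts. This appears to be because the first is computed over all observations, including missing ones, and the second over non-missing values only. A minimal sketch under that assumption follows; the variable names are illustrative and not taken from the report.

```python
# Counts copied from the Asintomatico tables above.
counts = {"NO": 624025, "SI": 221240}
missing = 259707

non_missing = sum(counts.values())
total = non_missing + missing  # all observations in the dataset

# Share of every observation, missing included (first table).
share_of_all = {k: round(100 * v / total, 1) for k, v in counts.items()}
share_of_all["(Missing)"] = round(100 * missing / total, 1)

# Share of non-missing observations only (second table).
share_of_answered = {k: round(100 * v / non_missing, 1) for k, v in counts.items()}

print(total, share_of_all, share_of_answered)
```

The same arithmetic applies to the other SI/NO variables in this section, which all share the 259707 missing rows.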

Most occurring characters

ValueCountFrequency (%)
N 624025
36.9%
O 624025
36.9%
S 221240
 
13.1%
I 221240
 
13.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 624025
36.9%
O 624025
36.9%
S 221240
 
13.1%
I 221240
 
13.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 624025
36.9%
O 624025
36.9%
S 221240
 
13.1%
I 221240
 
13.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 624025
36.9%
O 624025
36.9%
S 221240
 
13.1%
I 221240
 
13.1%

Fiebre
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 564958
SI: 280307

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: SI
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         564958  51.1%
SI         280307  25.4%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     564958  66.8%
si     280307  33.2%

Most occurring characters

ValueCountFrequency (%)
N 564958
33.4%
O 564958
33.4%
S 280307
16.6%
I 280307
16.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 564958
33.4%
O 564958
33.4%
S 280307
16.6%
I 280307
16.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 564958
33.4%
O 564958
33.4%
S 280307
16.6%
I 280307
16.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 564958
33.4%
O 564958
33.4%
S 280307
16.6%
I 280307
16.6%

Tos
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 502493
SI: 342772

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SI
2nd row: SI
3rd row: NO
4th row: NO
5th row: SI

Common Values

Value      Count   Frequency (%)
NO         502493  45.5%
SI         342772  31.0%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     502493  59.4%
si     342772  40.6%

Most occurring characters

ValueCountFrequency (%)
N 502493
29.7%
O 502493
29.7%
S 342772
20.3%
I 342772
20.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 502493
29.7%
O 502493
29.7%
S 342772
20.3%
I 342772
20.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 502493
29.7%
O 502493
29.7%
S 342772
20.3%
I 342772
20.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 502493
29.7%
O 502493
29.7%
S 342772
20.3%
I 342772
20.3%

Disnea
Categorical

MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 736050
SI: 109215

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         736050  66.6%
SI         109215  9.9%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     736050  87.1%
si     109215  12.9%

Most occurring characters

ValueCountFrequency (%)
N 736050
43.5%
O 736050
43.5%
S 109215
 
6.5%
I 109215
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 736050
43.5%
O 736050
43.5%
S 109215
 
6.5%
I 109215
 
6.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 736050
43.5%
O 736050
43.5%
S 109215
 
6.5%
I 109215
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 736050
43.5%
O 736050
43.5%
S 109215
 
6.5%
I 109215
 
6.5%

Cefalea
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 536690
SI: 308575

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SI
2nd row: NO
3rd row: SI
4th row: SI
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         536690  48.6%
SI         308575  27.9%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     536690  63.5%
si     308575  36.5%

Most occurring characters

ValueCountFrequency (%)
N 536690
31.7%
O 536690
31.7%
S 308575
18.3%
I 308575
18.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 536690
31.7%
O 536690
31.7%
S 308575
18.3%
I 308575
18.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 536690
31.7%
O 536690
31.7%
S 308575
18.3%
I 308575
18.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 536690
31.7%
O 536690
31.7%
S 308575
18.3%
I 308575
18.3%

Rinorrea
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 555739
SI: 289526

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SI
2nd row: NO
3rd row: NO
4th row: SI
5th row: SI

Common Values

Value      Count   Frequency (%)
NO         555739  50.3%
SI         289526  26.2%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     555739  65.7%
si     289526  34.3%

Most occurring characters

ValueCountFrequency (%)
N 555739
32.9%
O 555739
32.9%
S 289526
17.1%
I 289526
17.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 555739
32.9%
O 555739
32.9%
S 289526
17.1%
I 289526
17.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 555739
32.9%
O 555739
32.9%
S 289526
17.1%
I 289526
17.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 555739
32.9%
O 555739
32.9%
S 289526
17.1%
I 289526
17.1%

DolorGarganta
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 562257
SI: 283008

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SI
2nd row: SI
3rd row: NO
4th row: SI
5th row: SI

Common Values

Value      Count   Frequency (%)
NO         562257  50.9%
SI         283008  25.6%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     562257  66.5%
si     283008  33.5%

Most occurring characters

ValueCountFrequency (%)
N 562257
33.3%
O 562257
33.3%
S 283008
16.7%
I 283008
16.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 562257
33.3%
O 562257
33.3%
S 283008
16.7%
I 283008
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 562257
33.3%
O 562257
33.3%
S 283008
16.7%
I 283008
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 562257
33.3%
O 562257
33.3%
S 283008
16.7%
I 283008
16.7%

DolorMuscular
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 585423
SI: 259842

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SI
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         585423  53.0%
SI         259842  23.5%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     585423  69.3%
si     259842  30.7%

Most occurring characters

ValueCountFrequency (%)
N 585423
34.6%
O 585423
34.6%
S 259842
15.4%
I 259842
15.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 585423
34.6%
O 585423
34.6%
S 259842
15.4%
I 259842
15.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 585423
34.6%
O 585423
34.6%
S 259842
15.4%
I 259842
15.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 585423
34.6%
O 585423
34.6%
S 259842
15.4%
I 259842
15.4%

PerdidaOlfato
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 799986
SI: 45279

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SI
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         799986  72.4%
SI         45279   4.1%
(Missing)  259707  23.5%
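
The IMBALANCE flag on PerdidaOlfato corresponds to the overview alert "PerdidaOlfato is highly imbalanced (69.9%)". The report does not state how that percentage is computed. One plausible definition, assumed here, is one minus the Shannon entropy of the non-missing class shares, normalised by the maximum entropy for the number of classes. The sketch below applies that assumed definition to the counts above; `imbalance_score` is a hypothetical helper, not a function of any particular library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts (assumed definition)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(counts) < 2:
        return 0.0  # a single class has no meaningful imbalance score
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# NO and SI counts for PerdidaOlfato, missing values excluded.
score = imbalance_score([799986, 45279])
print(f"{100 * score:.1f}%")
```

Under this assumption a perfectly balanced variable scores 0% and a variable with a single observed value approaches 100%.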

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     799986  94.6%
si     45279   5.4%

Most occurring characters

ValueCountFrequency (%)
N 799986
47.3%
O 799986
47.3%
S 45279
 
2.7%
I 45279
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 799986
47.3%
O 799986
47.3%
S 45279
 
2.7%
I 45279
 
2.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 799986
47.3%
O 799986
47.3%
S 45279
 
2.7%
I 45279
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 799986
47.3%
O 799986
47.3%
S 45279
 
2.7%
I 45279
 
2.7%

PerdidaGusto
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 811748
SI: 33517

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         811748  73.5%
SI         33517   3.0%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     811748  96.0%
si     33517   4.0%

Most occurring characters

ValueCountFrequency (%)
N 811748
48.0%
O 811748
48.0%
S 33517
 
2.0%
I 33517
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 811748
48.0%
O 811748
48.0%
S 33517
 
2.0%
I 33517
 
2.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 811748
48.0%
O 811748
48.0%
S 33517
 
2.0%
I 33517
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 811748
48.0%
O 811748
48.0%
S 33517
 
2.0%
I 33517
 
2.0%

Diarrea
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 823784
SI: 21481

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         823784  74.6%
SI         21481   1.9%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     823784  97.5%
si     21481   2.5%

Most occurring characters

ValueCountFrequency (%)
N 823784
48.7%
O 823784
48.7%
S 21481
 
1.3%
I 21481
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 823784
48.7%
O 823784
48.7%
S 21481
 
1.3%
I 21481
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 823784
48.7%
O 823784
48.7%
S 21481
 
1.3%
I 21481
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 823784
48.7%
O 823784
48.7%
S 21481
 
1.3%
I 21481
 
1.3%

OtroSintoma
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 829822
SI: 15443

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         829822  75.1%
SI         15443   1.4%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     829822  98.2%
si     15443   1.8%

Most occurring characters

ValueCountFrequency (%)
N 829822
49.1%
O 829822
49.1%
S 15443
 
0.9%
I 15443
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 829822
49.1%
O 829822
49.1%
S 15443
 
0.9%
I 15443
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 829822
49.1%
O 829822
49.1%
S 15443
 
0.9%
I 15443
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 829822
49.1%
O 829822
49.1%
S 15443
 
0.9%
I 15443
 
0.9%

EspecifiqueOtro
Text

MISSING 

Distinct: 3900
Distinct (%): 0.5%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB

Length

Max length: 115
Median length: 0
Mean length: 0.31640846
Min length: 0

Characters and Unicode

Total characters: 267449
Distinct characters: 63
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 2755
Unique (%): 0.3%

Sample

1st row: (empty)
2nd row: (empty)
3rd row: (empty)
4th row: (empty)
5th row: (empty)
Value                Count  Frequency (%)
dolor                2463   7.0%
nauseas              1877   5.3%
vomito               1860   5.3%
anosmia              1528   4.3%
fatiga               1409   4.0%
vomitos              1351   3.8%
mialgias             1200   3.4%
astenia              1135   3.2%
cardiopatia          1034   2.9%
nasal                791    2.2%
Other values (1986)  20658  58.5%

Most occurring characters

ValueCountFrequency (%)
A 41580
15.5%
I 31454
11.8%
O 27725
10.4%
S 19876
 
7.4%
E 15562
 
5.8%
T 14813
 
5.5%
R 14625
 
5.5%
N 13773
 
5.1%
12795
 
4.8%
D 11522
 
4.3%
Other values (53) 63724
23.8%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   249164  93.2%
Space Separator    12795   4.8%
Other Punctuation  4292    1.6%
Control            777     0.3%
Decimal Number     256     0.1%
Math Symbol        58      < 0.1%
Dash Punctuation   38      < 0.1%
Other Symbol       29      < 0.1%
Open Punctuation   18      < 0.1%
Close Punctuation  18      < 0.1%
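
The category names in the table above match Unicode general categories (for example Lu for Uppercase Letter, Zs for Space Separator, Po for Other Punctuation, Nd for Decimal Number, So for Other Symbol). As an illustration only, the sketch below tallies categories for a single string with Python's standard `unicodedata` module. The sample string is hypothetical and not taken from the dataset, and the report's own implementation may differ.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Hypothetical free-text value in the style of EspecifiqueOtro.
sample = "DOLOR, 38°"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Summing such per-value counters over a whole column would give totals comparable to the table above.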

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 41580
16.7%
I 31454
12.6%
O 27725
11.1%
S 19876
8.0%
E 15562
 
6.2%
T 14813
 
5.9%
R 14625
 
5.9%
N 13773
 
5.5%
D 11522
 
4.6%
L 9889
 
4.0%
Other values (22) 48345
19.4%
Other Punctuation
ValueCountFrequency (%)
, 3718
86.6%
. 361
 
8.4%
/ 88
 
2.1%
: 78
 
1.8%
% 32
 
0.7%
* 9
 
0.2%
; 2
 
< 0.1%
' 1
 
< 0.1%
¿ 1
 
< 0.1%
? 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 53
20.7%
9 53
20.7%
8 47
18.4%
1 21
 
8.2%
5 18
 
7.0%
0 18
 
7.0%
2 14
 
5.5%
6 13
 
5.1%
4 10
 
3.9%
7 9
 
3.5%
Math Symbol
ValueCountFrequency (%)
+ 54
93.1%
> 2
 
3.4%
< 2
 
3.4%
Space Separator
Value    Count  Frequency (%)
(space)  12795  100.0%
Control
Value                Count  Frequency (%)
(control character)  777    100.0%
Dash Punctuation
ValueCountFrequency (%)
- 38
100.0%
Other Symbol
ValueCountFrequency (%)
° 29
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 249164
93.2%
Common 18285
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 41580
16.7%
I 31454
12.6%
O 27725
11.1%
S 19876
8.0%
E 15562
 
6.2%
T 14813
 
5.9%
R 14625
 
5.9%
N 13773
 
5.5%
D 11522
 
4.6%
L 9889
 
4.0%
Other values (22) 48345
19.4%
Common
ValueCountFrequency (%)
12795
70.0%
, 3718
 
20.3%
777
 
4.2%
. 361
 
2.0%
/ 88
 
0.5%
: 78
 
0.4%
+ 54
 
0.3%
3 53
 
0.3%
9 53
 
0.3%
8 47
 
0.3%
Other values (21) 261
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 266969
99.8%
None 480
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 41580
15.6%
I 31454
11.8%
O 27725
10.4%
S 19876
 
7.4%
E 15562
 
5.8%
T 14813
 
5.5%
R 14625
 
5.5%
N 13773
 
5.2%
12795
 
4.8%
D 11522
 
4.3%
Other values (43) 63244
23.7%
None
ValueCountFrequency (%)
Ó 220
45.8%
Ñ 86
 
17.9%
Á 84
 
17.5%
Í 33
 
6.9%
° 29
 
6.0%
Ú 15
 
3.1%
É 7
 
1.5%
´ 4
 
0.8%
¿ 1
 
0.2%
¡ 1
 
0.2%

HipertensionArterial
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 788294
SI: 56971

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: SI
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         788294  71.3%
SI         56971   5.2%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     788294  93.3%
si     56971   6.7%

Most occurring characters

ValueCountFrequency (%)
N 788294
46.6%
O 788294
46.6%
S 56971
 
3.4%
I 56971
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 788294
46.6%
O 788294
46.6%
S 56971
 
3.4%
I 56971
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 788294
46.6%
O 788294
46.6%
S 56971
 
3.4%
I 56971
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 788294
46.6%
O 788294
46.6%
S 56971
 
3.4%
I 56971
 
3.4%

Diabetes
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 811967
SI: 33298

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: SI
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         811967  73.5%
SI         33298   3.0%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
no     811967  96.1%
si     33298   3.9%

Most occurring characters

ValueCountFrequency (%)
N 811967
48.0%
O 811967
48.0%
S 33298
 
2.0%
I 33298
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 811967
48.0%
O 811967
48.0%
S 33298
 
2.0%
I 33298
 
2.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 811967
48.0%
O 811967
48.0%
S 33298
 
2.0%
I 33298
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 811967
48.0%
O 811967
48.0%
S 33298
 
2.0%
I 33298
 
2.0%

EnfermedadPulmonarCronica
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 259707
Missing (%): 23.5%
Memory size: 8.4 MiB
NO: 839005
SI: 6260

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1690530
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value      Count   Frequency (%)
NO         839005  75.9%
SI         6260    0.6%
(Missing)  259707  23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 839005
99.3%
si 6260
 
0.7%

Most occurring characters

ValueCountFrequency (%)
N 839005
49.6%
O 839005
49.6%
S 6260
 
0.4%
I 6260
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 839005
49.6%
O 839005
49.6%
S 6260
 
0.4%
I 6260
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 839005
49.6%
O 839005
49.6%
S 6260
 
0.4%
I 6260
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 839005
49.6%
O 839005
49.6%
S 6260
 
0.4%
I 6260
 
0.4%

Obesidad
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
827561 
SI
 
17704

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 827561
74.9%
SI 17704
 
1.6%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 827561
97.9%
si 17704
 
2.1%

Most occurring characters

ValueCountFrequency (%)
N 827561
49.0%
O 827561
49.0%
S 17704
 
1.0%
I 17704
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 827561
49.0%
O 827561
49.0%
S 17704
 
1.0%
I 17704
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 827561
49.0%
O 827561
49.0%
S 17704
 
1.0%
I 17704
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 827561
49.0%
O 827561
49.0%
S 17704
 
1.0%
I 17704
 
1.0%

Asma
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
823517 
SI
 
21748

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 823517
74.5%
SI 21748
 
2.0%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 823517
97.4%
si 21748
 
2.6%

Most occurring characters

ValueCountFrequency (%)
N 823517
48.7%
O 823517
48.7%
S 21748
 
1.3%
I 21748
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 823517
48.7%
O 823517
48.7%
S 21748
 
1.3%
I 21748
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 823517
48.7%
O 823517
48.7%
S 21748
 
1.3%
I 21748
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 823517
48.7%
O 823517
48.7%
S 21748
 
1.3%
I 21748
 
1.3%

EnfermedadRenalCronica
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
840840 
SI
 
4425
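
The IMBALANCE flag on this variable corresponds to the 95.3% figure in the report's alert list. A minimal sketch of how such a score can be computed, assuming it is the entropy-based measure 1 − H / log2(k), where H is the Shannon entropy of the class shares and k the number of classes. That definition is an assumption about the profiling tool, not something stated in this report, and `imbalance_score` is a hypothetical helper name:

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k); 0 means perfectly balanced, 1 means a single class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the class shares, in bits (empty classes contribute 0).
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)


# NO and SI counts for EnfermedadRenalCronica, copied from this report.
score = imbalance_score([840_840, 4_425])
print(f"{100 * score:.1f}%")
```

Under that assumption the score is driven entirely by the non-missing values; the 23.5% of missing rows do not enter the calculation.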

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 840840
76.1%
SI 4425
 
0.4%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 840840
99.5%
si 4425
 
0.5%

Most occurring characters

ValueCountFrequency (%)
N 840840
49.7%
O 840840
49.7%
S 4425
 
0.3%
I 4425
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 840840
49.7%
O 840840
49.7%
S 4425
 
0.3%
I 4425
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 840840
49.7%
O 840840
49.7%
S 4425
 
0.3%
I 4425
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 840840
49.7%
O 840840
49.7%
S 4425
 
0.3%
I 4425
 
0.3%

Inmunosupresion
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
840041 
SI
 
5224

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 840041
76.0%
SI 5224
 
0.5%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 840041
99.4%
si 5224
 
0.6%

Most occurring characters

ValueCountFrequency (%)
N 840041
49.7%
O 840041
49.7%
S 5224
 
0.3%
I 5224
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 840041
49.7%
O 840041
49.7%
S 5224
 
0.3%
I 5224
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 840041
49.7%
O 840041
49.7%
S 5224
 
0.3%
I 5224
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 840041
49.7%
O 840041
49.7%
S 5224
 
0.3%
I 5224
 
0.3%

AlcoholismoCronico
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
842191 
SI
 
3074

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 842191
76.2%
SI 3074
 
0.3%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 842191
99.6%
si 3074
 
0.4%

Most occurring characters

ValueCountFrequency (%)
N 842191
49.8%
O 842191
49.8%
S 3074
 
0.2%
I 3074
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 842191
49.8%
O 842191
49.8%
S 3074
 
0.2%
I 3074
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 842191
49.8%
O 842191
49.8%
S 3074
 
0.2%
I 3074
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 842191
49.8%
O 842191
49.8%
S 3074
 
0.2%
I 3074
 
0.2%

EnfermedadNeurologicaCronica
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
842316 
SI
 
2949

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 842316
76.2%
SI 2949
 
0.3%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 842316
99.7%
si 2949
 
0.3%

Most occurring characters

ValueCountFrequency (%)
N 842316
49.8%
O 842316
49.8%
S 2949
 
0.2%
I 2949
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 842316
49.8%
O 842316
49.8%
S 2949
 
0.2%
I 2949
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 842316
49.8%
O 842316
49.8%
S 2949
 
0.2%
I 2949
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 842316
49.8%
O 842316
49.8%
S 2949
 
0.2%
I 2949
 
0.2%

Tabaquismo
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
839913 
SI
 
5352

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 839913
76.0%
SI 5352
 
0.5%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 839913
99.4%
si 5352
 
0.6%

Most occurring characters

ValueCountFrequency (%)
N 839913
49.7%
O 839913
49.7%
S 5352
 
0.3%
I 5352
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 839913
49.7%
O 839913
49.7%
S 5352
 
0.3%
I 5352
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 839913
49.7%
O 839913
49.7%
S 5352
 
0.3%
I 5352
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 839913
49.7%
O 839913
49.7%
S 5352
 
0.3%
I 5352
 
0.3%

Embarazo
Categorical

IMBALANCE  MISSING 

Distinct2
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Memory size8.4 MiB
NO
822369 
SI
 
22896

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1690530
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNO
2nd rowNO
3rd rowNO
4th rowNO
5th rowNO

Common Values

ValueCountFrequency (%)
NO 822369
74.4%
SI 22896
 
2.1%
(Missing) 259707
 
23.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 822369
97.3%
si 22896
 
2.7%

Most occurring characters

ValueCountFrequency (%)
N 822369
48.6%
O 822369
48.6%
S 22896
 
1.4%
I 22896
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1690530
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 822369
48.6%
O 822369
48.6%
S 22896
 
1.4%
I 22896
 
1.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 822369
48.6%
O 822369
48.6%
S 22896
 
1.4%
I 22896
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1690530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 822369
48.6%
O 822369
48.6%
S 22896
 
1.4%
I 22896
 
1.4%

SemanasGestacion
Real number (ℝ)

MISSING  ZEROS 

Distinct62
Distinct (%)< 0.1%
Missing259707
Missing (%)23.5%
Infinite0
Infinite (%)0.0%
Mean1.9639675
Minimum0
Maximum95
Zeros789118
Zeros (%)71.4%
Negative0
Negative (%)0.0%
Memory size8.4 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile24
Maximum95
Range95
Interquartile range (IQR)0

Descriptive statistics

Standard deviation7.9099785
Coefficient of variation (CV)4.0275506
Kurtosis15.863918
Mean1.9639675
Median Absolute Deviation (MAD)0
Skewness4.0864059
Sum1660073
Variance62.567761
MonotonicityNot monotonic
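
Several of the figures above are tied together by simple identities, which makes them easy to sanity-check. A short Python sketch using only numbers printed in this report; it assumes the mean is taken over non-missing values, and the variable names are illustrative:

```python
# Figures copied from the SemanasGestacion section and the dataset overview.
n_rows = 1_104_972
n_missing = 259_707
n_zeros = 789_118
total = 1_660_073            # "Sum"
reported_mean = 1.9639675
reported_std = 7.9099785
reported_cv = 4.0275506
reported_variance = 62.567761

n_present = n_rows - n_missing

mean = total / n_present                 # mean over non-missing values
cv = reported_std / reported_mean        # coefficient of variation = std / mean
variance = reported_std ** 2             # variance = std squared
zeros_pct = 100 * n_zeros / n_rows       # zeros as a share of all rows

print(mean, cv, variance, round(zeros_pct, 1))
```

The zero share is computed against all rows, missing included, which is why 789118 zeros appear as 71.4% rather than as a share of the 845265 non-missing values.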
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 789118
71.4%
29 3126
 
0.3%
30 2670
 
0.2%
38 2648
 
0.2%
32 2438
 
0.2%
28 2349
 
0.2%
37 2246
 
0.2%
35 2217
 
0.2%
27 2206
 
0.2%
36 2191
 
0.2%
Other values (52) 34056
 
3.1%
(Missing) 259707
 
23.5%
ValueCountFrequency (%)
0 789118
71.4%
1 950
 
0.1%
2 708
 
0.1%
3 401
 
< 0.1%
4 506
 
< 0.1%
5 305
 
< 0.1%
6 614
 
0.1%
7 426
 
< 0.1%
8 460
 
< 0.1%
9 467
 
< 0.1%
ValueCountFrequency (%)
95 1
 
< 0.1%
87 1
 
< 0.1%
85 2
 
< 0.1%
80 1
 
< 0.1%
65 1
 
< 0.1%
57 1
 
< 0.1%
56 4
 
< 0.1%
54 1
 
< 0.1%
53 1
 
< 0.1%
52 708
0.1%

Observaciones
Text

Distinct25501
Distinct (%)2.3%
Missing1
Missing (%)< 0.1%
Memory size8.4 MiB

Length

Max length166
Median length0
Mean length4.5981261
Min length0

Characters and Unicode

Total characters5080796
Distinct characters80
Distinct categories16 ?
Distinct scripts2 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
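
As a rough illustration of where the category counts further down come from, Python's standard `unicodedata` module exposes the general category of each code point. The sample string below is made up for the example and is not taken from the dataset; `count_categories` is a hypothetical helper:

```python
import unicodedata
from collections import Counter


def count_categories(text):
    """Count characters by Unicode general category (two-letter codes)."""
    return Counter(unicodedata.category(ch) for ch in text)


# Hypothetical free-text value in the style of this column.
sample = "2ª DOSIS PFIZER (COVID-19)"
counts = count_categories(sample)

# Lu = Uppercase Letter, Nd = Decimal Number, Zs = Space Separator,
# Lo = Other Letter (e.g. the ordinal indicator), Ps/Pe = Open/Close Punctuation,
# Pd = Dash Punctuation.
for code, n in counts.most_common():
    print(code, n)
```

The ordinal indicators ª and º are classified as Other Letter, and the degree sign ° as Other Symbol, which matches how they are grouped in the tables below.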

Unique

Unique20758 ?
Unique (%)1.9%

Sample

1st row
2nd row
3rd row
4th row
5th row
ValueCountFrequency (%)
asintomatico 217272
35.3%
dosis 38343
 
6.2%
de 26407
 
4.3%
pfizer 25220
 
4.1%
2 22056
 
3.6%
3 13181
 
2.1%
moderna 12264
 
2.0%
astrazeneca 9961
 
1.6%
1 7214
 
1.2%
covid 6986
 
1.1%
Other values (9172) 236033
38.4%

Most occurring characters

ValueCountFrequency (%)
A 668349
13.2%
I 614622
12.1%
O 610080
12.0%
T 513175
10.1%
S 364543
7.2%
N 336944
 
6.6%
C 315125
 
6.2%
290097
 
5.7%
M 269902
 
5.3%
E 193757
 
3.8%
Other values (70) 904202
17.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4508171
88.7%
Space Separator 290097
 
5.7%
Decimal Number 183525
 
3.6%
Other Punctuation 70084
 
1.4%
Dash Punctuation 10136
 
0.2%
Control 8790
 
0.2%
Close Punctuation 3708
 
0.1%
Open Punctuation 3706
 
0.1%
Other Letter 1526
 
< 0.1%
Math Symbol 629
 
< 0.1%
Other values (6) 424
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 668349
14.8%
I 614622
13.6%
O 610080
13.5%
T 513175
11.4%
S 364543
8.1%
N 336944
7.5%
C 315125
7.0%
M 269902
6.0%
E 193757
 
4.3%
D 131107
 
2.9%
Other values (25) 490567
10.9%
Other Punctuation
ValueCountFrequency (%)
, 33261
47.5%
/ 22754
32.5%
. 7743
 
11.0%
# 3247
 
4.6%
: 2096
 
3.0%
; 455
 
0.6%
? 417
 
0.6%
* 85
 
0.1%
" 8
 
< 0.1%
& 7
 
< 0.1%
Other values (4) 11
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 66711
36.3%
1 38949
21.2%
3 22249
 
12.1%
0 19601
 
10.7%
9 9546
 
5.2%
4 6789
 
3.7%
8 5328
 
2.9%
5 4890
 
2.7%
7 4842
 
2.6%
6 4620
 
2.5%
Math Symbol
ValueCountFrequency (%)
+ 618
98.3%
| 6
 
1.0%
= 4
 
0.6%
< 1
 
0.2%
Control
ValueCountFrequency (%)
8779
99.9%
11
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 3705
> 99.9%
{ 1
 
< 0.1%
Other Letter
ValueCountFrequency (%)
ª 994
65.1%
º 532
34.9%
Other Symbol
ValueCountFrequency (%)
° 285
99.7%
😁 1
 
0.3%
Modifier Symbol
ValueCountFrequency (%)
´ 15
83.3%
` 3
 
16.7%
Space Separator
ValueCountFrequency (%)
290097
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10136
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3708
100.0%
Other Number
ValueCountFrequency (%)
111
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 4
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4509697
88.8%
Common 571099
 
11.2%

Most frequent character per script

Common
ValueCountFrequency (%)
290097
50.8%
2 66711
 
11.7%
1 38949
 
6.8%
, 33261
 
5.8%
/ 22754
 
4.0%
3 22249
 
3.9%
0 19601
 
3.4%
- 10136
 
1.8%
9 9546
 
1.7%
8779
 
1.5%
Other values (33) 49016
 
8.6%
Latin
ValueCountFrequency (%)
A 668349
14.8%
I 614622
13.6%
O 610080
13.5%
T 513175
11.4%
S 364543
8.1%
N 336944
7.5%
C 315125
7.0%
M 269902
6.0%
E 193757
 
4.3%
D 131107
 
2.9%
Other values (27) 492093
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5077181
99.9%
None 3613
 
0.1%
Punctuation 1
 
< 0.1%
Emoticons 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 668349
13.2%
I 614622
12.1%
O 610080
12.0%
T 513175
10.1%
S 364543
7.2%
N 336944
 
6.6%
C 315125
 
6.2%
290097
 
5.7%
M 269902
 
5.3%
E 193757
 
3.8%
Other values (53) 900587
17.7%
None
ValueCountFrequency (%)
ª 994
27.5%
Á 803
22.2%
Í 550
15.2%
º 532
14.7%
° 285
 
7.9%
Ó 118
 
3.3%
111
 
3.1%
É 85
 
2.4%
Ñ 82
 
2.3%
À 18
 
0.5%
Other values (5) 35
 
1.0%
Punctuation
ValueCountFrequency (%)
1
100.0%
Emoticons
ValueCountFrequency (%)
😁 1
100.0%

Interactions

[Figure: grid of 36 pairwise interaction plots]

Correlations

Variable OrdenLaboratorioDetalleId Edad Region FK_EstablecimientoId NumeroControl SemanasGestacion TipoEdad Sexo Departamento DepartamentoResidencia Prueba Resultado Asintomatico Fiebre Tos Disnea Cefalea Rinorrea DolorGarganta DolorMuscular PerdidaOlfato PerdidaGusto Diarrea OtroSintoma HipertensionArterial Diabetes EnfermedadPulmonarCronica Obesidad Asma EnfermedadRenalCronica Inmunosupresion AlcoholismoCronico EnfermedadNeurologicaCronica Tabaquismo Embarazo
OrdenLaboratorioDetalleId 1.000 -0.035 0.081 0.352 0.999 -0.077 0.041 0.040 0.144 0.136 0.269 0.136 0.158 0.121 0.111 0.049 0.111 0.093 0.075 0.112 0.171 0.141 0.076 0.066 0.020 0.020 0.029 0.032 0.019 0.020 0.031 0.016 0.015 0.016 0.046
Edad -0.035 1.000 0.073 -0.037 -0.035 -0.037 0.036 0.015 0.037 0.046 0.011 0.012 0.014 0.036 0.004 0.081 0.038 0.055 0.050 0.022 0.015 0.011 0.013 0.010 0.290 0.221 0.124 0.031 0.002 0.071 0.042 0.021 0.027 0.027 0.079
Region 0.081 0.073 1.000 0.005 0.082 -0.117 0.039 0.021 0.952 0.944 0.098 0.042 0.204 0.203 0.197 0.144 0.174 0.175 0.162 0.185 0.142 0.133 0.095 0.072 0.104 0.068 0.053 0.051 0.059 0.028 0.045 0.026 0.024 0.033 0.182
FK_EstablecimientoId 0.352 -0.037 0.005 1.000 0.352 -0.040 0.014 0.017 0.313 0.281 0.168 0.034 0.077 0.052 0.047 0.051 0.061 0.055 0.068 0.061 0.014 0.009 0.014 0.022 0.046 0.039 0.037 0.016 0.028 0.015 0.008 0.024 0.021 0.021 0.081
NumeroControl 0.999 -0.035 0.082 0.352 1.000 -0.078 0.051 0.043 0.144 0.134 0.275 0.131 0.176 0.123 0.130 0.041 0.138 0.123 0.127 0.136 0.164 0.129 0.070 0.065 0.028 0.018 0.030 0.032 0.016 0.019 0.029 0.009 0.014 0.009 0.052
SemanasGestacion -0.077 -0.037 -0.117 -0.040 -0.078 1.000 0.010 0.025 0.132 0.130 0.009 0.033 0.037 0.072 0.075 0.029 0.087 0.064 0.057 0.084 0.073 0.064 0.043 0.044 0.013 0.010 0.013 0.010 0.012 0.008 0.006 0.006 0.008 0.004 0.239
TipoEdad 0.041 0.036 0.039 0.014 0.051 0.010 1.000 0.027 0.059 0.066 0.031 0.027 0.039 0.058 0.055 0.122 0.078 0.024 0.075 0.075 0.029 0.024 0.007 0.002 0.035 0.027 0.007 0.019 0.016 0.007 0.005 0.007 0.001 0.010 0.018
Sexo 0.040 0.015 0.021 0.017 0.043 0.025 0.027 1.000 0.056 0.055 0.039 0.006 0.011 0.019 0.003 0.003 0.043 0.011 0.032 0.018 0.011 0.011 0.001 0.007 0.039 0.024 0.002 0.024 0.031 0.009 0.013 0.054 0.004 0.064 0.134
Departamento 0.144 0.037 0.952 0.313 0.144 0.132 0.059 0.056 1.000 0.939 0.160 0.071 0.254 0.244 0.236 0.166 0.220 0.216 0.206 0.225 0.160 0.149 0.104 0.086 0.112 0.070 0.058 0.063 0.066 0.036 0.037 0.031 0.023 0.038 0.250
DepartamentoResidencia 0.136 0.046 0.944 0.281 0.134 0.130 0.066 0.055 0.939 1.000 0.151 0.071 0.265 0.237 0.228 0.166 0.214 0.213 0.202 0.217 0.153 0.143 0.101 0.084 0.113 0.072 0.056 0.066 0.066 0.038 0.037 0.031 0.025 0.039 0.241
Prueba 0.269 0.011 0.098 0.168 0.275 0.009 0.031 0.039 0.160 0.151 1.000 0.058 0.054 0.054 0.057 0.053 0.032 0.052 0.050 0.031 0.009 0.014 0.005 0.009 0.009 0.011 0.004 0.000 0.008 0.017 0.017 0.001 0.007 0.006 0.074
Resultado 0.136 0.012 0.042 0.034 0.131 0.033 0.027 0.006 0.071 0.071 0.058 1.000 0.171 0.160 0.165 0.024 0.154 0.136 0.144 0.172 0.150 0.123 0.043 0.023 0.026 0.014 0.022 0.022 0.007 0.015 0.015 0.015 0.009 0.009 0.057
Asintomatico 0.158 0.014 0.204 0.077 0.176 0.037 0.039 0.011 0.254 0.265 0.054 0.171 1.000 0.418 0.490 0.228 0.450 0.429 0.421 0.396 0.142 0.121 0.096 0.079 0.076 0.053 0.042 0.054 0.065 0.014 0.008 0.009 0.000 0.024 0.129
Fiebre 0.121 0.036 0.203 0.052 0.123 0.072 0.058 0.019 0.244 0.237 0.054 0.160 0.418 1.000 0.531 0.291 0.470 0.425 0.411 0.481 0.211 0.184 0.136 0.078 0.056 0.049 0.032 0.049 0.059 0.016 0.007 0.010 0.011 0.020 0.053
Tos 0.111 0.004 0.197 0.047 0.130 0.075 0.055 0.003 0.236 0.228 0.057 0.165 0.490 0.531 1.000 0.338 0.490 0.558 0.504 0.438 0.200 0.168 0.105 0.063 0.111 0.078 0.070 0.074 0.096 0.020 0.009 0.017 0.010 0.036 0.059
Disnea 0.049 0.081 0.144 0.051 0.041 0.029 0.122 0.003 0.166 0.166 0.053 0.024 0.228 0.291 0.338 1.000 0.187 0.191 0.150 0.179 0.135 0.118 0.076 0.055 0.100 0.090 0.150 0.062 0.113 0.064 0.026 0.033 0.028 0.043 0.033
Cefalea 0.111 0.038 0.174 0.061 0.138 0.087 0.078 0.043 0.220 0.214 0.032 0.154 0.450 0.470 0.490 0.187 1.000 0.526 0.559 0.571 0.213 0.183 0.133 0.080 0.092 0.057 0.016 0.070 0.068 0.000 0.000 0.006 0.000 0.026 0.055
Rinorrea 0.093 0.055 0.175 0.055 0.123 0.064 0.024 0.011 0.216 0.213 0.052 0.136 0.429 0.425 0.558 0.191 0.526 1.000 0.543 0.455 0.197 0.163 0.102 0.049 0.068 0.038 0.002 0.054 0.073 0.009 0.004 0.004 0.000 0.022 0.047
DolorGarganta 0.075 0.050 0.162 0.068 0.127 0.057 0.075 0.032 0.206 0.202 0.050 0.144 0.421 0.411 0.504 0.150 0.559 0.543 1.000 0.561 0.202 0.167 0.110 0.034 0.078 0.043 0.002 0.061 0.064 0.011 0.006 0.000 0.003 0.020 0.058
DolorMuscular 0.112 0.022 0.185 0.061 0.136 0.084 0.075 0.018 0.225 0.217 0.031 0.172 0.396 0.481 0.438 0.179 0.571 0.455 0.561 1.000 0.244 0.209 0.147 0.064 0.086 0.056 0.012 0.065 0.059 0.001 0.002 0.008 0.000 0.029 0.057
PerdidaOlfato 0.171 0.015 0.142 0.014 0.164 0.073 0.029 0.011 0.160 0.153 0.009 0.150 0.142 0.211 0.200 0.135 0.213 0.197 0.202 0.244 1.000 0.632 0.196 0.044 0.022 0.013 0.000 0.031 0.021 0.005 0.005 0.003 0.001 0.012 0.019
PerdidaGusto 0.141 0.011 0.133 0.009 0.129 0.064 0.024 0.011 0.149 0.143 0.014 0.123 0.121 0.184 0.168 0.118 0.183 0.163 0.167 0.209 0.632 1.000 0.235 0.047 0.027 0.015 0.000 0.030 0.019 0.002 0.002 0.005 0.000 0.014 0.016
Diarrea 0.076 0.013 0.095 0.014 0.070 0.043 0.007 0.001 0.104 0.101 0.005 0.043 0.096 0.136 0.105 0.076 0.133 0.102 0.110 0.147 0.196 0.235 1.000 0.099 0.031 0.023 0.002 0.032 0.021 0.003 0.007 0.010 0.004 0.016 0.017
OtroSintoma 0.066 0.010 0.072 0.022 0.065 0.044 0.002 0.007 0.086 0.084 0.009 0.023 0.079 0.078 0.063 0.055 0.080 0.049 0.034 0.064 0.044 0.047 0.099 1.000 0.034 0.030 0.036 0.032 0.020 0.007 0.018 0.010 0.004 0.013 0.012
HipertensionArterial 0.020 0.290 0.104 0.046 0.028 0.013 0.035 0.039 0.112 0.113 0.009 0.026 0.076 0.056 0.111 0.100 0.092 0.068 0.078 0.086 0.022 0.027 0.031 0.034 1.000 0.370 0.117 0.148 0.049 0.107 0.039 0.020 0.027 0.029 0.038
Diabetes 0.020 0.221 0.068 0.039 0.018 0.010 0.027 0.024 0.070 0.072 0.011 0.014 0.053 0.049 0.078 0.090 0.057 0.038 0.043 0.056 0.013 0.015 0.023 0.030 0.370 1.000 0.088 0.131 0.037 0.126 0.039 0.021 0.023 0.022 0.026
EnfermedadPulmonarCronica 0.029 0.124 0.053 0.037 0.030 0.013 0.007 0.002 0.058 0.056 0.004 0.022 0.042 0.032 0.070 0.150 0.016 0.002 0.002 0.012 0.000 0.000 0.002 0.036 0.117 0.088 1.000 0.047 0.044 0.080 0.041 0.043 0.034 0.092 0.012
Obesidad 0.032 0.031 0.051 0.016 0.032 0.010 0.019 0.024 0.063 0.066 0.000 0.022 0.054 0.049 0.074 0.062 0.070 0.054 0.061 0.065 0.031 0.030 0.032 0.032 0.148 0.131 0.047 1.000 0.067 0.042 0.023 0.036 0.014 0.053 0.012
Asma 0.019 0.002 0.059 0.028 0.016 0.012 0.016 0.031 0.066 0.066 0.008 0.007 0.065 0.059 0.096 0.113 0.068 0.073 0.064 0.059 0.021 0.019 0.021 0.020 0.049 0.037 0.044 0.067 1.000 0.011 0.007 0.010 0.006 0.017 0.013
EnfermedadRenalCronica 0.020 0.071 0.028 0.015 0.019 0.008 0.007 0.009 0.036 0.038 0.017 0.015 0.014 0.016 0.020 0.064 0.000 0.009 0.011 0.001 0.005 0.002 0.003 0.007 0.107 0.126 0.080 0.042 0.011 1.000 0.047 0.033 0.026 0.024 0.009
Inmunosupresion 0.031 0.042 0.045 0.008 0.029 0.006 0.005 0.013 0.037 0.037 0.017 0.015 0.008 0.007 0.009 0.026 0.000 0.004 0.006 0.002 0.005 0.002 0.007 0.018 0.039 0.039 0.041 0.023 0.007 0.047 1.000 0.027 0.029 0.017 0.008
AlcoholismoCronico 0.016 0.021 0.026 0.024 0.009 0.006 0.007 0.054 0.031 0.031 0.001 0.015 0.009 0.010 0.017 0.033 0.006 0.004 0.000 0.008 0.003 0.005 0.010 0.010 0.020 0.021 0.043 0.036 0.010 0.033 0.027 1.000 0.022 0.301 0.007
EnfermedadNeurologicaCronica 0.015 0.027 0.024 0.021 0.014 0.008 0.001 0.004 0.023 0.025 0.007 0.009 0.000 0.011 0.010 0.028 0.000 0.000 0.003 0.000 0.001 0.000 0.004 0.004 0.027 0.023 0.034 0.014 0.006 0.026 0.029 0.022 1.000 0.024 0.004
Tabaquismo 0.016 0.027 0.033 0.021 0.009 0.004 0.010 0.064 0.038 0.039 0.006 0.009 0.024 0.020 0.036 0.043 0.026 0.022 0.020 0.029 0.012 0.014 0.016 0.013 0.029 0.022 0.092 0.053 0.017 0.024 0.017 0.301 0.024 1.000 0.011
Embarazo 0.046 0.079 0.182 0.081 0.052 0.239 0.018 0.134 0.250 0.241 0.074 0.057 0.129 0.053 0.059 0.033 0.055 0.047 0.058 0.057 0.019 0.016 0.017 0.012 0.038 0.026 0.012 0.012 0.013 0.009 0.008 0.007 0.004 0.011 1.000
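
In the matrix above, negative values appear only between numeric variables, while every pair involving a categorical variable is non-negative. That is consistent with an association measure such as Cramér's V being used for categorical pairs, which is my understanding of the profiling tool's default "auto" setting, though the report itself does not say so. A minimal sketch of the uncorrected Cramér's V on a contingency table; the example tables are invented and `cramers_v` is a hypothetical helper:

```python
from math import sqrt


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(col_sums)) - 1
    return sqrt(chi2 / (n * k))


# Invented 2x2 tables: rows = symptom A (NO, SI), columns = symptom B (NO, SI).
strong = cramers_v([[90, 10], [10, 90]])
none = cramers_v([[50, 50], [50, 50]])
print(strong, none)
```

Because V ranges from 0 to 1 and carries no sign, a value such as 0.632 between PerdidaOlfato and PerdidaGusto indicates strength of association only, not direction. A profiling tool may also apply a bias correction that this sketch omits.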

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
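
Nullity correlation can be read as the Pearson correlation between the "is missing" indicators of two columns. The sketch below uses plain Python and invented toy columns, with `None` standing in for a missing value; `nullity_correlation` is a hypothetical helper, not the report's own code:

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing has no variance
    return cov / sqrt(var_a * var_b)


# Invented toy columns: the first two are missing in exactly the same rows.
diabetes = ["NO", None, "SI", None, "NO", "NO"]
asma = ["NO", None, "NO", None, "SI", "NO"]
edad = [46, 48, None, 41, 38, 29]

together = nullity_correlation(diabetes, asma)
apart = nullity_correlation(diabetes, edad)
print(together, apart)
```

In this report the comorbidity and pregnancy variables all show the same missing count of 259707, which would be consistent with a nullity correlation near 1 among them, although identical counts alone do not prove the same rows are missing.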

Sample

OrdenLaboratorioDetalleId FechaRecepcionMuestra CodigoMuestra Expediente NombreCompleto Edad TipoEdad Sexo Ocupacion Region Departamento Municipio DepartamentoResidencia MunicipioResidencia Localidad Telefono ES_Notificante Prueba Resultado CT FechaResultado FechaInicioSintomas FechaTomaMuestra FK_EstablecimientoId Laboratorio NumeroControl UsuarioRecepcion UsuarioResultado Asintomatico Fiebre Tos Disnea Cefalea Rinorrea DolorGarganta DolorMuscular PerdidaOlfato PerdidaGusto Diarrea OtroSintoma EspecifiqueOtro HipertensionArterial Diabetes EnfermedadPulmonarCronica Obesidad Asma EnfermedadRenalCronica Inmunosupresion AlcoholismoCronico EnfermedadNeurologicaCronica Tabaquismo Embarazo SemanasGestacion Observaciones
0406692020-12-16 16:31:46.653LNV 2107780801197409314CARLOS ROBERTO MEDINA ACOSTA46AÑOSHOMBREINGENIERO ELECTRICISTA19FRANCISCO MORAZANDISTRITO CENTRALNoneNoneLOMA LINDANone(90040)(TJE) TRIAJE CCGSARS CoV-2 (RT-PCR)NEGATIVONone2020-12-28 12:28:28.3402020-12-152020-12-151997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA303.0ARMARTINEZYALVARENGANoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
1494462021-01-09 13:05:26.620LNV 2190640801197200856RICARDO FRANCISCO GONZALEZ MEJIA48AÑOSHOMBREINGENIERO ELECTRICISTA19FRANCISCO MORAZANDISTRITO CENTRALNoneNoneLUIS LANDANone(90040)(TJE) TRIAJE CCGSARS CoV-2 (RT-PCR)POSITIVO25.362021-01-10 11:42:23.3302021-01-072021-01-071997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA316.0APERDOMOEVALLADARESNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
21145192021-02-17 08:33:43.653LNV 263159P-17519WILLIAM JAMES LORENZ77AÑOSHOMBREINGENIERO ELECTRICISTA2COLONTRUJILLONoneNoneJERICONone(90059)(TJE) TRIAJE TRUJILLOSARS CoV-2 (RT-PCR)NEGATIVONone2021-02-18 11:13:51.1502021-02-152021-02-151997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA355.0NFIALLOSFACOSTANoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
32836442021-06-01 15:30:09.847LRA 147360209197901312CARLOS ALFREDO MIRANDA SABIO41AÑOSHOMBREINGENIERO ELECTRICISTA2COLONTRUJILLONoneNoneCAPIRONone(90059)(TJE) TRIAJE TRUJILLOSARS CoV-2 (RT-PCR)POSITIVONone2021-06-02 11:50:01.6432021-05-142021-05-3184051(84051)(LAB RE) LAB REGIONAL DE ATLANTIDA459.0FAMARTINEZRTORRESNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
42731332021-05-26 13:26:08.503LNV 3597361709198200981DOUGLAS REINALDO ZERON JUAREZ38AÑOSHOMBREINGENIERO CIVIL17VALLESAN LORENZONoneNoneEL CENTRO SAN LORENZO, VALLENone(8982)(H.ARE) SAN LORENZOSARS CoV-2 (RT-PCR)POSITIVO29.372021-05-27 11:24:26.2132021-05-112021-05-221997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA453.0NBENAVIDESMVASQUEZNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
51841202021-04-09 10:57:28.633LNV 3097910703199102116WENDI JAHAIRA MENDEZ HERNANDEZ29AÑOSMUJERINGENIERO CIVIL7EL PARAISODANLINoneNoneEL ARENAL, DANLI.None(90053)(TJE) TRIAJE PEDRO NUFIOSARS CoV-2 (RT-PCR)POSITIVO22.442021-04-10 10:51:45.6172021-03-262021-04-071997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA406.0ECASTROYALVARENGANoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
61880322021-04-12 10:37:26.810LNV 3111220703199304057BRYAN JOSUE GOMEZ IRIAS27AÑOSHOMBREINGENIERO CIVIL7EL PARAISODANLINoneNoneBO EL CARMELO, DANLI.None(90053)(TJE) TRIAJE PEDRO NUFIOSARS CoV-2 (RT-PCR)POSITIVO36.472021-04-13 10:14:59.1032021-04-052021-04-091997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA409.0ECASTROEVASQUEZNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
7344862020-12-09 18:54:38.230LNV 2045940703199501731EDGAR DAVID ZAMBRANO RODRIGUEZ25AÑOSHOMBREINGENIERO CIVIL7EL PARAISODANLINoneNoneCOLONIA VILLEDA MORALESNone(90068)(RS) DEPARTAMENTAL DE EL PARAISOSARS CoV-2 (RT-PCR)NEGATIVONone2020-12-10 15:03:34.0702020-12-082020-12-081997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA285.0NFIALLOSYALVARENGANoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
81876642021-04-12 10:22:51.040LNV 3110820710199400004DENNIS ALBERTO CASTELLANOS MERLO27AÑOSHOMBREINGENIERO CIVIL7EL PARAISOPOTRERILLOSNoneNoneVILLA DE SAN FRANCISCONone(90107)(TJE) TRIAJE POTRERILLOS EL PARAISOSARS CoV-2 (RT-PCR)POSITIVO27.062021-04-13 10:02:40.2772021-04-042021-04-091997(1997)(VIR) LABORATORIO NACIONAL DE VIROLOGIA409.0ASILVAEVASQUEZNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
92529172021-05-13 14:01:03.553LRA 107920501194301530SAUL FERNANDO LANZA78AÑOSHOMBRESD1ATLANTIDALA CEIBANoneNoneSDNone(5665)(H.REG.) ATLANTIDA, LA CEIBASARS CoV-2 (RT-PCR)NEGATIVONone2021-05-15 14:37:48.2172021-04-292021-05-1384051(84051)(LAB RE) LAB REGIONAL DE ATLANTIDA441.0RMARADIAGAFCASTRONoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNaN
OrdenLaboratorioDetalleId FechaRecepcionMuestra CodigoMuestra Expediente NombreCompleto Edad TipoEdad Sexo Ocupacion Region Departamento Municipio DepartamentoResidencia MunicipioResidencia Localidad Telefono ES_Notificante Prueba Resultado CT FechaResultado FechaInicioSintomas FechaTomaMuestra FK_EstablecimientoId Laboratorio NumeroControl UsuarioRecepcion UsuarioResultado Asintomatico Fiebre Tos Disnea Cefalea Rinorrea DolorGarganta DolorMuscular PerdidaOlfato PerdidaGusto Diarrea OtroSintoma EspecifiqueOtro HipertensionArterial Diabetes EnfermedadPulmonarCronica Obesidad Asma EnfermedadRenalCronica Inmunosupresion AlcoholismoCronico EnfermedadNeurologicaCronica Tabaquismo Embarazo SemanasGestacion Observaciones
11049623916142021-07-19 10:27:54.050M 342660601201903921FANIS CAROLINA PALMA AYALA1AÑOSMUJERMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECACOLONIAS UNIDAS(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2021-07-162021-07-122021-07-1690001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA506.0GAVELARGAVELARNOSISINONONONOSINONONONONONONONONONONONONONONO0.0
11049633915532021-07-19 10:15:00.397M 342410601201900624YUSTIN NEYMAR CARRANZA AGUILAR2AÑOSHOMBREMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECAB. BRISAS DEL RIO(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2021-07-162021-07-132021-07-1690001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA506.0GAVELARGAVELARNOSISISINONONONONONONONONONONONONONONONONONONO0.0
11049647576742022-02-10 09:27:09.887M 1796150601201603119JENIFER SOFIA OCHOA AGUILAR5AÑOSMUJERMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECACOL.NAJAR(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2022-01-212022-01-152022-01-2190001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA717.0MCASTROGAVELARNOSISISISISISISINONONONONONONONONONONONONONONO0.0
11049657543292022-02-08 09:43:56.797M 177942P-131713CARLOS GAEL SEVILLA1AÑOSHOMBREMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECASD(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2022-01-202022-01-082022-01-2090001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA710.0MCASTROMCASTRONOSISISISISISISINONONONONONONONONONONONONONONO0.0
11049668642922022-06-30 10:14:04.810M 2313840601201900624YUSTIN NEYMAR CARRANZA AGUILAR2AÑOSHOMBREMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECAB. BRISAS DEL RIO(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2022-06-282022-06-202022-06-2890001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA853.0MCASTROMCASTRONOSISINONONOSINONONONONONONONONONONONONONONONO0.0
11049679709632022-08-29 09:23:42.810M 3153280608200200104MARIA JOSE SALGADO9MESESMUJERMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECASD(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2022-08-272022-08-252022-08-2790001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA949.0GAVELARMCASTRONOSISINONONONONONONONONONONONONONONONONONONONO0.0
110496811443172023-05-02 10:33:40.943M 4634700801201921690KENER DAVID RAUDALES BETANCO3AÑOSHOMBREMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECACOL.MONTECARLO(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2023-04-282023-04-282023-04-2890001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA1159.0GAVELARGAVELARNOSISISISISISISINONONONONONONONONONONONONONONO0.0
11049694110002021-07-26 11:10:35.827M 395600601202001366ALEJANDRO JOSE PALACIOS ARRIOLA1AÑOSHOMBREMENOR6CHOLUTECACHOLUTECACHOLUTECASANTA ANA DE YUSGUARESD(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2021-07-252021-07-202021-07-2590001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA513.0GAVELARGAVELARNOSISINONONONONONONONONONONONONONONONONONONONO0.0
11049706636152021-12-16 13:55:19.383M 1425590601201102654YENI MARISOL HERNANDEZ ALVAREZ11AÑOSMUJERMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECASD(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2021-12-04NaT2021-12-0490001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA657.0GAVELARGAVELARSINONONONONONONONONONONONONONONONONONONONONONO0.0ASINTOMATICO
11049717459212022-02-03 09:51:41.390M 174601P-130200AITANA CATALEYA VELASQUEZ5AÑOSMUJERMENOR6CHOLUTECACHOLUTECACHOLUTECACHOLUTECASD(3964)(H.REG.) DEL SUR, CHOLUTECASARS CoV-2 (ANTIGENO-RDT)NEGATIVONone2022-01-192022-01-112022-01-1990001(90001)(RS) DEPARTAMENTAL DE CHOLUTECA705.0MCASTROGAVELARNOSISISISISISISINONONONONONONONONONONONONONONO0.0